NOAA-GSL / ExascaleWorkflowSandbox

Refactor CI workflows #27

Closed christopherwharrop-noaa closed 1 year ago

christopherwharrop-noaa commented 1 year ago

This update refactors the CI workflows to improve their efficiency and to add critical tests for Parsl and Flux with MPI in a Slurm cluster environment. Previous CI workflows used a complex multi-stage container strategy to build the entire ExaWorks tool set (flux, radical, parsl, and stc) using Spack. This tested the install scripts for those tools, but it took many hours to run and didn't allow for testing of Parsl and Flux applications in a cluster environment. Because the previous CI tests are intractable and don't allow for the type of functional testing we need, they are being discarded. We have also decided to focus our efforts on the integration of Parsl and Flux, and have put explorations of Radical and Swift/T on the back burner for now.

We recently discovered that Parsl must be installed with pip instead of Spack, because that is the only way to obtain the later versions that are required when using it with Flux. We've also realized that we need to test with the Intel compilers, so the installation strategy has changed. The new CI builds a containerized Slurm cluster that has the Intel oneAPI compilers, Spack, Flux, and Parsl installed in it. The CI then uses that container to run Parsl and Flux tests in a Slurm cluster environment with MPI. The test scripts are mounted into the container via a volume, so the container doesn't need to be rebuilt as more tests are added.
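
For illustration only, the volume-mount idea might be sketched roughly as below. This is not the actual CI code: the image name `example/slurm-image:latest`, the `tests` directory, the `/tests` mount point, the script name `test_example.py`, and the helper `build_test_command` are all hypothetical, and the real workflow may well start the cluster differently (for example with several containers). The point it tries to show is that the tests live on the host and are bind-mounted at run time, so adding a test does not change the image.

```python
import shlex
from pathlib import Path


def build_test_command(image, tests_dir, test_script, mount_point="/tests"):
    """Build a `docker run` argument list that bind-mounts the host test
    directory into the container and runs one test script from it.

    All names passed in are illustrative assumptions, not values taken
    from the repository.
    """
    # Docker's -v flag needs an absolute host path for a bind mount.
    host_dir = Path(tests_dir).resolve()
    script_path = f"{mount_point}/{test_script}"
    return [
        "docker", "run", "--rm",
        # host_dir on the host appears as mount_point inside the container
        "-v", f"{host_dir}:{mount_point}",
        image,
        "python3", script_path,
    ]


if __name__ == "__main__":
    cmd = build_test_command(
        "example/slurm-image:latest",  # hypothetical image name
        "tests",                       # hypothetical host test directory
        "test_example.py",             # hypothetical test script
    )
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(cmd))
```

Because only the mounted directory changes when a test is added, the image layers stay identical between runs and can be reused.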

christopherwharrop-noaa commented 1 year ago

@NaureenBharwaniNOAA - Thank you for the approval. Once the CI test passes, I will merge it. If more changes are necessary to make it work, I'll need to request another review.

christopherwharrop-noaa commented 1 year ago

@NaureenBharwaniNOAA - I finally figured out how to get this all to work correctly. The Docker layer caching with GitHub Actions should be working now, and the full MPI Parsl/Flux tests are passing too. Please take another look when you have a moment.