NOAA-GSL / ExascaleWorkflowSandbox

Add Parsl Apps for installing JEDI QG #93

Closed christopherwharrop-noaa closed 4 months ago

christopherwharrop-noaa commented 4 months ago

This PR adds Chiltepin Parsl Apps that wrap installation of the JEDI QG model. This serves as a first step toward addressing #92. Subsequent PRs will add additional wrappers that configure and execute various components of the QG model. A test is added in this PR that uses the Parsl wrapper to clone, configure, build, and install the minimal parts of jedi-bundle needed to use the QG model.
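As a rough illustration of the clone, configure, build, and install sequence described above, the sketch below builds the shell command for each step as a plain Python function. In Parsl, a function decorated with `@bash_app` returns a command string of this kind, so these could plausibly be turned into Apps by adding the decorator. The function names, directory layout, repository URL, and job count are assumptions for illustration, not the actual Chiltepin wrappers from this PR.

```python
import shlex

# Assumed repository URL; the PR only says that jedi-bundle is cloned.
JEDI_BUNDLE_URL = "https://github.com/JCSDA/jedi-bundle.git"


def clone_cmd(url, src_dir):
    """Shell command that clones the bundle into src_dir."""
    return f"git clone {shlex.quote(url)} {shlex.quote(src_dir)}"


def configure_cmd(src_dir, build_dir, install_dir):
    """Shell command that configures an out-of-source build with ecbuild."""
    b, s, i = (shlex.quote(p) for p in (build_dir, src_dir, install_dir))
    return f"mkdir -p {b} && cd {b} && ecbuild --prefix={i} {s}"


def build_cmd(build_dir, jobs):
    """Shell command that compiles with the given number of make jobs."""
    return f"make -C {shlex.quote(build_dir)} -j{jobs}"


def install_cmd(build_dir):
    """Shell command that installs the built artifacts."""
    return f"make -C {shlex.quote(build_dir)} install"


def install_steps(root, url=JEDI_BUNDLE_URL, jobs=4):
    """Return the four commands in the order they would run (hypothetical layout)."""
    src, build, install = f"{root}/src", f"{root}/build", f"{root}/install"
    return [
        clone_cmd(url, src),
        configure_cmd(src, build, install),
        build_cmd(build, jobs),
        install_cmd(build),
    ]


if __name__ == "__main__":
    for step in install_steps("/tmp/jedi"):
        print(step)
```

Keeping each step as its own function mirrors the one-App-per-step structure, so each step can be sent to whichever partition suits it.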

To accommodate the JEDI QG install Apps, the configuration was reworked to include three partitions: service, serial, and parallel. The service partition runs Apps that need internet access (cloning and ecbuild). The serial partition runs non-MPI Apps, which may or may not be multithreaded (make). The parallel partition runs Apps that execute parallel codes (running a forecast).
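The routing described above can be sketched as a small lookup table. In Parsl, this kind of routing is typically expressed by passing `executors=[...]` to an App decorator; whether Chiltepin does exactly that is an assumption here. The task names and the `partition_for` helper are hypothetical and only mirror the examples given in the text.

```python
# What each partition is for, as described in this PR.
PARTITIONS = {
    "service": "Apps that need internet access",
    "serial": "non-MPI Apps, possibly multithreaded",
    "parallel": "Apps that execute parallel codes",
}

# Hypothetical task names mapped to the partition the text assigns them.
TASK_PARTITION = {
    "clone": "service",
    "ecbuild": "service",
    "make": "serial",
    "forecast": "parallel",
}


def partition_for(task):
    """Return the partition label for a task, or raise if it is unknown."""
    try:
        return TASK_PARTITION[task]
    except KeyError:
        raise ValueError(f"no partition assigned for task {task!r}") from None


if __name__ == "__main__":
    for task in TASK_PARTITION:
        part = partition_for(task)
        print(f"{task}: {part} ({PARTITIONS[part]})")
```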

Additionally, to increase test efficiency, the Fortran mpi_pi.F90 test code was refactored to collect samples for 30 seconds instead of collecting a billion samples regardless of where it runs. This allows for more predictable test execution times and makes testing on GitHub runners faster.
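To show the difference between a fixed sample count and a fixed time budget, here is a minimal serial Python sketch of the time-bounded approach. It assumes the test code estimates pi by Monte Carlo sampling, which the word "samples" suggests but the PR does not state. It is not a translation of mpi_pi.F90, and the batch size and function name are illustrative.

```python
import random
import time


def estimate_pi(duration_s, batch=10_000, seed=None):
    """Sample points in the unit square until duration_s has elapsed.

    Returns (pi_estimate, total_samples). At least one batch is always
    collected, so the estimate is defined even for a tiny time budget.
    """
    rng = random.Random(seed)
    deadline = time.monotonic() + duration_s
    inside = 0
    total = 0
    while True:
        # Check the clock once per batch rather than once per sample.
        for _ in range(batch):
            x, y = rng.random(), rng.random()
            if x * x + y * y <= 1.0:
                inside += 1
        total += batch
        if time.monotonic() >= deadline:
            break
    return 4.0 * inside / total, total


if __name__ == "__main__":
    pi_est, n = estimate_pi(1.0)
    print(f"pi is approximately {pi_est:.5f} from {n} samples")
```

With a time budget, a slow host collects fewer samples but finishes in about the same wall-clock time as a fast one, which is what makes test duration predictable.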

christopherwharrop-noaa commented 4 months ago

@NaureenBharwaniNOAA - This is finally ready to go. There was some very confusing behavior on the GitHub runner that I could not reproduce in the container on my laptop, but I finally figured it out. For the tests not to hang, the number of cores for each Slurm cluster "node" needs to be no more than half of the cores on the GitHub runner. I think this has something to do with runner virtualization, but I'm not sure. Since I now have 8 cores per Slurm "node", I have to use a runner with at least 16 cores. One of the tests also runs a code across two Slurm nodes (16 cores total), so the runner needs 16 or more cores for that reason as well.
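The sizing arithmetic above can be written down as a small helper. The "half the runner's cores" rule is only the empirical observation from this comment, not a documented GitHub or Slurm limit, and the function name is hypothetical.

```python
def min_runner_cores(cores_per_node, nodes_per_job):
    """Smallest runner core count that satisfies both constraints.

    half_rule: each Slurm "node" may use at most half of the runner's cores
               (empirical observation, cause not confirmed).
    job_rule:  the largest job needs cores_per_node * nodes_per_job cores.
    """
    half_rule = 2 * cores_per_node
    job_rule = cores_per_node * nodes_per_job
    return max(half_rule, job_rule)


if __name__ == "__main__":
    # 8 cores per Slurm "node", largest test spans two nodes.
    print(min_runner_cores(8, 2))
```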

christopherwharrop-noaa commented 4 months ago

That's weird. I already ran it through black and isort and flake8. I'll take a look.

christopherwharrop-noaa commented 4 months ago

@NaureenBharwaniNOAA - Thank you for the feedback. Please take a look to see whether I've addressed the issues you raised.