NCAR / ccpp-scm

CCPP Single Column Model

UFS-dev PR#184 #482

Closed grantfirl closed 1 month ago

grantfirl commented 1 month ago

Analogous to https://github.com/NCAR/fv3atm/pull/128

grantfirl commented 1 month ago

@mkavulich @dustinswales @scrasmussen @ligiabernardet @lulinxue All, I'm going to need your help getting this PR to work. With https://github.com/ufs-community/ccpp-physics/pull/160, the CCPP requires that MPI be used by the host model. Even though this is overkill for the SCM right now, we still need to implement it. I think I've put the simplest possible MPI implementation into the SCM in this PR: we're basically just importing mpi_f08, calling mpi_init(), and passing the MPI_COMM_WORLD communicator set up by mpi_init() to the CCPP.
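For reference, here is a minimal sketch of that pattern using the mpi_f08 module; the program and variable names (scm_mpi_sketch, scm_comm) are illustrative, not the actual SCM source:

```fortran
program scm_mpi_sketch
  ! Minimal single-rank MPI setup of the kind described above; illustrative only.
  use mpi_f08
  implicit none

  type(MPI_Comm) :: scm_comm   ! communicator that gets handed to the CCPP
  integer        :: mpirank, ierr

  call MPI_Init(ierr)                          ! the SCM does no real domain decomposition
  scm_comm = MPI_COMM_WORLD                    ! pass MPI_COMM_WORLD through to the CCPP
  call MPI_Comm_rank(scm_comm, mpirank, ierr)

  ! ... run the CCPP init/physics/finalize phases with scm_comm ...

  call MPI_Finalize(ierr)                      ! clean shutdown
end program scm_mpi_sketch
```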

It seems to work and run through a case OK, but there are a couple of things we still need to address. The first is the run script. Previously, one could simply invoke the executable directly, but now I think we'll need to use srun, right? That is how the run script is currently set up anyway. One also needs to provide an account. I've hardcoded it to gmtb right now, just for testing, but I'm guessing we should reference an environment variable in the run script for this. Do you agree, or is there a better way? The run script also just sets the number of processes to 1, since the SCM code isn't actually doing anything with multiple processes. I think we'll need to update some documentation for this as well.

Secondly, we need to update the CI workflows to load/install MPI for the various compilers if we want CI to continue to work. I briefly looked at this, but figured that one of you could probably do this more efficiently.

If you're able to help, please submit a PR into my PR branch (ufs-dev-PR184). This PR is required to catch the SCM up to the latest physics in ufs/dev, so we need it for the release.

dustinswales commented 1 month ago

@grantfirl I'll take a stab at this later today. FYI, here are the changes needed to the CI for openmpi: https://github.com/NCAR/ccpp-framework/pull/551. I think @mkavulich has worked on this some in a personal branch. Mike, what is the status of this?

mkavulich commented 1 month ago

@grantfirl Sorry I missed today's physics meeting; I forgot to RSVP, so my calendar didn't remind me!

@dustinswales The build and run are working fine in my existing branch for PR #477 (including the framework updates that make MPI mandatory), but I don't have all of the same physics changes as this PR. When I test this PR (and when I attempt to bring these changes into my own branch), I get argument mismatch compiler errors:

[ 25%] Building Fortran object ccpp/physics/CMakeFiles/ccpp_physics.dir/physics/MP/Ferrier_Aligo/mp_fer_hires.F90.o
/scratch1/BMC/hmtb/kavulich/CCPP/workdir/suite_names_update/PR_482/ccpp-scm/ccpp/physics/physics/MP/Ferrier_Aligo/mp_fer_hires.F90(90): error #6633: The type of the actual argument differs from the type of the dummy argument.   [MPICOMM]
       CALL FERRIER_INIT_HR(dtp,mpicomm,mpirank,mpiroot,threads,errmsg,errflg)
--------------------------------^
compilation aborted for /scratch1/BMC/hmtb/kavulich/CCPP/workdir/suite_names_update/PR_482/ccpp-scm/ccpp/physics/physics/MP/Ferrier_Aligo/mp_fer_hires.F90 (code 1)
make[2]: *** [ccpp/physics/CMakeFiles/ccpp_physics.dir/build.make:1635: ccpp/physics/CMakeFiles/ccpp_physics.dir/physics/MP/Ferrier_Aligo/mp_fer_hires.F90.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:230: ccpp/physics/CMakeFiles/ccpp_physics.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

I don't understand why this is happening, though; it looks like all the necessary changes have been made to convert mpicomm to the "MPI_Comm" type. My build attempt is on Hera here: /scratch1/BMC/hmtb/kavulich/CCPP/workdir/suite_names_update/PR_482/ccpp-scm
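For illustration, error #6633 usually means an mpi_f08 type(MPI_Comm) actual argument is meeting a dummy argument still declared as a plain integer (or vice versa) at the call boundary. Below is a hypothetical sketch of declarations that would make the two sides consistent; the real FERRIER_INIT_HR interface and its arguments may differ:

```fortran
! Hypothetical sketch, not the actual ccpp-physics code: with the mpi_f08
! transition, both the actual argument and the scheme's dummy need to be
! declared as type(MPI_Comm); a leftover "integer :: mpicomm" dummy on
! either side produces the actual/dummy type mismatch shown above.
module comm_type_demo
  use mpi_f08, only: MPI_Comm
  implicit none
contains
  subroutine scheme_init(mpicomm, mpirank, mpiroot, errmsg, errflg)
    type(MPI_Comm),   intent(in)  :: mpicomm   ! was: integer, intent(in) :: mpicomm
    integer,          intent(in)  :: mpirank, mpiroot
    character(len=*), intent(out) :: errmsg
    integer,          intent(out) :: errflg
    errmsg = ''
    errflg = 0
  end subroutine scheme_init
end module comm_type_demo
```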

@grantfirl @dustinswales Does this make sense to either of you?

dustinswales commented 1 month ago

@mkavulich Not sure what's happening on your end. I just opened a PR into this branch adding openmpi to the CI tests and replacing "srun -a gmtb" with "mpirun". I still have some tiny CI issues I'm debugging, which should be resolved imminently.

mkavulich commented 1 month ago

@grantfirl I figured out the issue: I was compiling with all CCPP suites, while the tests only compile a subset, and compiling with all suites triggers this failure. It seems that Ferrier_Aligo is not used in any of our tested CCPP suites (in the SCM or in ufs-weather-model), so we didn't catch the failure before now. The question is: is this worth fixing at this stage? I think it is, but I also don't know what fix is needed...

dustinswales commented 1 month ago

> @grantfirl I figured out the issue: I was compiling with all CCPP suites, while the tests only compile a subset, and compiling with all suites triggers this failure. It seems that Ferrier_Aligo is not used in any of our tested CCPP suites (in the SCM or in ufs-weather-model), so we didn't catch the failure before now. The question is: is this worth fixing at this stage? I think it is, but I also don't know what fix is needed...

Not worth it. Open an issue noting that the Ferrier-Aligo MP scheme isn't working with mpi_f08, and we can ping the developer. The fact that this didn't get caught on the UFS side suggests that they don't test it either, so I don't think we need to prioritize this.

grantfirl commented 1 month ago

Thanks @dustinswales. I merged in the CI fixes. Since the run script is using mpirun, Hera requires us to submit a batch job or run on an interactive node. If I run in interactive mode, I can successfully run through the cases in rt_test_cases.py.

mkavulich commented 1 month ago

@grantfirl Can we add a variable to control the run command? I worry that mpirun will not be supported on all platforms.

grantfirl commented 1 month ago

For future reference, for Hera interactive mode, I used: salloc -q debug -t 0:15:00 --nodes=1 -A gmtb

If you don't use a compute node, the run will fail without a useful error message. I didn't even figure out what had happened until I got an email from RDHPCS saying that using mpirun on a login node is prohibited.

grantfirl commented 1 month ago

> @grantfirl Can we add a variable to control the run command? I worry that mpirun will not be supported on all platforms.

Sure. What are the options? mpirun, srun, mpiexec?

grantfirl commented 1 month ago

We should also play with command line options for those commands to make sure that we're getting enough written to stderr/stdout to debug as effectively as before. I think that, by default, I was getting far less useful debugging information using mpirun without other options.

grantfirl commented 1 month ago

Also, note from @dustinswales's PR that added the CI fixes to this one: MPI overhead appears to add about 20% to the computation time on Hera, at least with my test case (the ARM_SGP_summer_1997_A case with the HR3 suite).

mkavulich commented 1 month ago

> @grantfirl Can we add a variable to control the run command? I worry that mpirun will not be supported on all platforms.

> Sure. What are the options? mpirun, srun, mpiexec?

I was just thinking a free-form string that can be specified as an argument to the run script, with mpirun (or whatever is the most appropriate) as a default.

grantfirl commented 1 month ago

@dustinswales DEPHY CI test was fixed by adding a MPI_FINALIZE call. I knew that I had forgotten that.

grantfirl commented 1 month ago

> @grantfirl Can we add a variable to control the run command? I worry that mpirun will not be supported on all platforms.

> Sure. What are the options? mpirun, srun, mpiexec?

> I was just thinking a free-form string that can be specified as an argument to the run script, with mpirun (or whatever is the most appropriate) as a default.

Is this what you had in mind? https://github.com/NCAR/ccpp-scm/pull/482/commits/81296f293a438d59b1300938ae28ee9e16c2dab6

grantfirl commented 1 month ago

Note: the Hera docs say that srun is preferred to mpirun and that we may need to set an environment variable: https://rdhpcs-common-docs.rdhpcs.noaa.gov/wiki/index.php?title=Running_and_Monitoring_Jobs

setenv I_MPI_PIN_RESPECT_CPUSET disable   # for csh/tcsh
export I_MPI_PIN_RESPECT_CPUSET=disable   # for bash-type shells

It seems to work OK without this set, and I don't know what it does. The default is 'off', not 'disable', and I also don't know whether those are equivalent.

mkavulich commented 1 month ago

> @grantfirl Can we add a variable to control the run command? I worry that mpirun will not be supported on all platforms.

> Sure. What are the options? mpirun, srun, mpiexec?

> I was just thinking a free-form string that can be specified as an argument to the run script, with mpirun (or whatever is the most appropriate) as a default.

> Is this what you had in mind? 81296f2

That looks great, exactly what I was thinking!

For the record, I was able to run most cases on Hera from the login node without having to submit a batch job. A few did fail, though; I don't know if that's because of the login node or something else. I'm currently investigating.

mkavulich commented 1 month ago

> For the record, I was able to run most cases on Hera from the login node without having to submit a batch job. A few did fail, though; I don't know if that's because of the login node or something else. I'm currently investigating.

Just to update on this: it looks like there isn't a problem with mpirun on login nodes per se; the jobs that were failing were killed by a Hera system-level process for using too much memory or other resources on the login nodes.

grantfirl commented 1 month ago

@mkavulich @dustinswales This is ready for review, then, I suppose.

grantfirl commented 1 month ago

> For the record, I was able to run most cases on Hera from the login node without having to submit a batch job. A few did fail, though; I don't know if that's because of the login node or something else. I'm currently investigating.
>
> Just to update on this: it looks like there isn't a problem with mpirun on login nodes per se; the jobs that were failing were killed by a Hera system-level process for using too much memory or other resources on the login nodes.

@mkavulich @dustinswales @scrasmussen

OK, this is good to know. Even if we only occasionally run into these failures (I guess I happened to pick a particularly memory-hungry combo in my initial test), we should probably address them. I think our options are to 1) recommend starting an interactive compute-node session on HPCs before using the run script, 2) recommend and better support batch scripts for running the SCM on HPCs, or 3) have the run script do 1 or 2 automatically.

Option 3 seems the most user-friendly but will also likely require the most ongoing maintenance. My preference is probably option 1. We should be encouraging folks to use compute nodes when using the SCM for real science anyway. Sure, you can maybe "get away with" using login nodes for some stuff, but they're really not supposed to be used for running models, no matter how simple.

I'm certainly open to suggestions here.

mkavulich commented 1 month ago

The changes look good to me, but I am curious as to why the RT CI is failing. It looks like it's failing in the plotting script?

grantfirl commented 1 month ago

> The changes look good to me, but I am curious as to why the RT CI is failing. It looks like it's failing in the plotting script?

Oh, I didn't even notice. Ya, let's take a look.

grantfirl commented 1 month ago

@mkavulich @dustinswales Looks like https://github.com/NCAR/ccpp-scm/pull/482/commits/4e1ad0788c2186889cea73e97e2b4f7b2c0191f0 fixed the plotting. I'm guessing the previously merged PR (#483) didn't catch this because there were no baseline differences; this PR does change baselines and actually triggered plot generation.

mkavulich commented 1 month ago

> @mkavulich @dustinswales Looks like https://github.com/NCAR/ccpp-scm/commit/4e1ad0788c2186889cea73e97e2b4f7b2c0191f0 fixed the plotting.

Looks good, I'll pull that into my branch as well. But now the Nvidia tests are failing at the CMake step due to not finding MPI; do they need openmpi as well, or is there something Nvidia-specific here?

grantfirl commented 1 month ago

> @mkavulich @dustinswales Looks like 4e1ad07 fixed the plotting.

> Looks good, I'll pull that into my branch as well. But now the Nvidia tests are failing at the CMake step due to not finding MPI; do they need openmpi as well, or is there something Nvidia-specific here?

I think the Nvidia workflow script needs to add an MPI installation too. I think we can deal with this in a follow-up PR, perhaps when the GF OpenACC bugfix comes in (because that also causes the same tests to fail).

dustinswales commented 1 month ago

@grantfirl @mkavulich The Nvidia compilers come with their own MPI, so we don't need to install anything else, but it will need some reconfiguring (in a follow-up PR).

grantfirl commented 1 month ago

@dustinswales @mkavulich I'll merge this and https://github.com/NCAR/ccpp-physics/pull/1073 when approved, but I'll hold off on https://github.com/NCAR/fv3atm/pull/128 and https://github.com/NCAR/ufs-weather-model/pull/132 as discussed due to the regression testing updates needed.

grantfirl commented 1 month ago

@dustinswales @mkavulich The DEPHY CI test failed after updating the ccpp/physics submodule pointer, but succeeded upon re-running. I don't know what that is about, but I'm going to merge and see if that happens again in subsequent PRs.

mkavulich commented 1 month ago

@grantfirl I had noticed the same thing on my PR after I pulled in your latest changes; I didn't think anything of it since it wasn't showing up on your branch. Didn't realize you had seen that happen too.

grantfirl commented 1 month ago

> @grantfirl I had noticed the same thing on my PR after I pulled in your latest changes; I didn't think anything of it since it wasn't showing up on your branch. Didn't realize you had seen that happen too.

I have a hypothesis that I'm going to test.