ESMCI / ccs_config_cesm

CESM CIME Case Control System configuration files

Use the system-wide MPI wrapper script on Derecho #139

Closed sjsprecious closed 6 months ago

sjsprecious commented 6 months ago

This PR introduces the system-wide mpibind wrapper script on Derecho, which handles the MPI-only and hybrid MPI/OpenMP configurations automatically. With this script, the MPI-only and hybrid MPI/OpenMP configurations perform comparably on Derecho for the F2000climo compset on the f09_f09_mg17 grid.

However, my tests indicate that the threading option in CAM does not work for the f19_f19_mg17 grid, whether I use the mpibind script or manually add the MPI arguments to the mpiexec command. Based on the error message below, this is likely a code issue rather than a problem with the MPI/OpenMP configuration.

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
libpthread-2.31.s  000015466C91C8C0  Unknown               Unknown  Unknown
cesm.exe           0000000003859342  Unknown               Unknown  Unknown
cesm.exe           000000000385A2FE  Unknown               Unknown  Unknown
cesm.exe           0000000003859531  Unknown               Unknown  Unknown
cesm.exe           000000000240A789  advance_clubb_cor        2548  advance_clubb_core_module.F90
cesm.exe           0000000001BA127E  clubb_intr_mp_clu        3326  clubb_intr.F90
cesm.exe           00000000007A9903  physpkg_mp_tphysa        1716  physpkg.F90
cesm.exe           00000000007A7320  physpkg_mp_phys_r        1259  physpkg.F90
libiomp5.so        00001546682E4493  __kmp_invoke_micr     Unknown  Unknown
libiomp5.so        0000154668252533  Unknown               Unknown  Unknown
libiomp5.so        0000154668251470  Unknown               Unknown  Unknown
libiomp5.so        00001546682E51FF  Unknown               Unknown  Unknown
libpthread-2.31.s  000015466C9106EA  Unknown               Unknown  Unknown
libc-2.31.so       0000154667EA3A6F  clone                 Unknown  Unknown

sjsprecious commented 6 months ago

Thanks @jedwards4b. I tried adding --label --line-buffer after the mpibind script, but it did not work. I think these options need to be added inside the script itself. I am checking with Rory to see whether he can update the script.

jedwards4b commented 6 months ago

Ideally the script would pass any option it didn't recognize downstream to the mpirun command.
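
As a rough illustration of that pass-through behavior (not the actual mpibind script, whose contents are not shown in this thread), a wrapper could parse only the options it owns and forward everything else unchanged. This is a minimal Python sketch; the `--threads-per-rank` option and the `build_launch_command` helper are hypothetical names chosen for the example, and `mpiexec` is assumed as the downstream launcher.

```python
import argparse


def build_launch_command(argv, launcher="mpiexec"):
    """Split wrapper-only options from the rest and forward the rest.

    Returns (threads_per_rank, command_list). The wrapper-only option
    below is hypothetical; the real script's options may differ.
    """
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    # Hypothetical option that only the wrapper understands.
    parser.add_argument("--threads-per-rank", type=int, default=1)

    # parse_known_args returns unrecognized arguments, in their original
    # order, instead of exiting with an error.
    known, passthrough = parser.parse_known_args(argv)

    # Everything the wrapper did not recognize goes to the launcher.
    command = [launcher] + passthrough
    return known.threads_per_rank, command


if __name__ == "__main__":
    threads, cmd = build_launch_command(
        ["--threads-per-rank", "2", "--label", "--line-buffer", "./cesm.exe"]
    )
    print(threads, cmd)
```

With this design the wrapper never needs to know the launcher's option list, so options such as --label and --line-buffer reach the launcher without any change to the wrapper.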

sjsprecious commented 6 months ago

Agreed. I will pass this request to Rory as well.

sjsprecious commented 6 months ago

Thanks to Rory's help, the mpibind script on Derecho now accepts additional arguments for the mpiexec command.