I set up a case using GFSv16 netcdf data as input for a C1152 L128 grid. My test script and config file are on Dell: /gpfs/dell2/emc/modeling/noscrub/George.Gayno/ufs_utils.git/chgres_mem
Using 'develop' at 570ea39 required 8 nodes/6 tasks per node. Using the branch at f584c91 only required 4 nodes/6 tasks per node.
Will try additional tests.
Tried C3072 L65 on Dell. Using 'develop' required 30 nodes/6 tasks per node. Using the branch required 20 nodes/6 tasks per node.
Here is the error I get from 'FieldRegrid', which is resolved by doubling the number of nodes:
Fatal error in MPI_Irecv: Invalid count, error stack:
MPI_Irecv(170): MPI_Irecv(buf=0x2b5d79127010, count=-219953152, MPI_BYTE, src=1, tag=0, comm=0x84000002, request=0x4835ba0) failed
MPI_Irecv(107): Negative count, value is -219953152
According to the ESMF group (@rsdunlapiv), this is the result of using 32-bit pointers in some ESMF routines.
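To make the overflow concrete: a byte count held in a default 32-bit signed integer wraps negative once it exceeds 2147483647 (about 2 GiB). The sketch below uses purely illustrative dimensions (a 3072x1536 source grid, 128 levels, three wind components, 8-byte reals) to show how a full 4-D wind buffer easily passes that limit; the actual per-message sizes depend on how ESMF decomposes the fields across tasks.

```fortran
program count_overflow
! Illustrative only: shows why a 4-D wind buffer can exceed the signed
! 32-bit byte-count limit.  Real message sizes depend on the ESMF/MPI
! decomposition, not on the full-field size used here.
  use iso_fortran_env, only : int64
  implicit none
  integer(int64), parameter :: nx = 3072, ny = 1536, nlev = 128, ncomp = 3
  integer(int64), parameter :: nbytes = nx * ny * nlev * ncomp * 8_int64  ! 8-byte reals

  print '(a,i0)', 'wind buffer size (bytes): ', nbytes   ! ~14.5 billion
  print '(a,i0)', 'signed 32-bit limit:      ', huge(1)  ! 2147483647
! Counts above huge(1) wrap to negative values when stored in 32 bits,
! producing failures like the MPI_Irecv error above.
end program count_overflow
```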
The ESMF group recommends a switch to ESMF v8.3 to help fix this. I just tried v8.3 on Hera using develop at f658c1e and all chgres regression tests passed. Will open an issue to upgrade to v8.3.
The ESMF group provided a test branch that fixes this - https://github.com/esmf-org/esmf/tree/feature/large-messages
I cloned and compiled this on Hera here: /scratch1/NCEPDEV/da/George.Gayno/noscrub/esmf.git/esmf
On Hera, I compiled 'develop' at 2a07b2c for use as the 'control'.
For the 'test', I compiled 'develop' using the updated ESMF branch. This was done by modifying the build module as follows:
< setenv("ESMFMKFILE","/scratch1/NCEPDEV/da/George.Gayno/noscrub/esmf.git/esmf/lib/libO/Linux.intel.64.intelmpi.default/esmf.mk")
---
> esmf_ver=os.getenv("esmf_ver") or "8.2.1b04"
> load(pathJoin("esmf", esmf_ver))
The test case was a C1152 grid using 128 vertical levels. All config files and scripts are here: /scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/chgres_memory
Running the 'control' with 7 nodes/6 tasks per node resulted in this error (see "log.fail.7nodes.develop"):
33: Fatal error in MPI_Irecv: Invalid count, error stack:
33: MPI_Irecv(170): MPI_Irecv(buf=0x2b367e533010, count=-1980497920, MPI_BYTE, src=33, tag=0, comm=0x84000002, request=0x5be22e0) failed
33: MPI_Irecv(107): Negative count, value is -1980497920
Rerunning with 8 nodes/6 tasks per node was successful. See "log.pass.8nodes.develop".
Running the 'test' (which used the updated ESMF branch) was successful using only 5 nodes/6 tasks per node. See "log.pass.5nodes.new.esmf.branch".
So, using the new ESMF test branch eliminates the MPI error and reduces the resources needed to run large grids.
Update from the ESMF team (Gerhard):
The large-message fix will be part of the upcoming v8.3.1 patch release. I will let you know once it's released. Of course, the fix will also go into ESMF develop toward the 8.4 release.
ESMF v8.3.1 was officially released: https://github.com/esmf-org/esmf/releases/tag/v8.3.1
Anning Cheng was trying to create a C3072 L128 grid using the gdas_init utility on Cactus. The wind fields in the coldstart files were not correct. I was able to repeat the problem using develop at 711a4dc. I then upgraded to ESMF v8.4.0bs08, but the problem persisted. I ran with 8 nodes/18 tasks per node and requested 500 GB of memory. A plot of the problem is attached.
So, the way I create the ESMF fields for the 3-D winds must have some other problem. As a test, I merged the latest updates from develop into the bug_fix/chgres_memory branch. I compiled 57792e3 on Cactus, then reran the test in the previous comment. The wind fields looked correct.
Users occasionally get out-of-memory issues when running chgres_cube for large domains. Almost always, this happens during the regridding of the 3-D winds to the edges of the grid box.
https://github.com/ufs-community/UFS_UTILS/blob/570ea3966c125ead7e90b4342be2efc22a8b1f41/sorc/chgres_cube.fd/atmosphere.F90#L347
I suspect this is because the ESMF field for winds is 4-dimensional (x, y, vertical level, and wind component).
https://github.com/ufs-community/UFS_UTILS/blob/570ea3966c125ead7e90b4342be2efc22a8b1f41/sorc/chgres_cube.fd/atmosphere.F90#L632
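Roughly speaking, the wind field is built as a single ESMF field with two ungridded dimensions (vertical level and wind component), along the lines of the sketch below. The names (`input_grid`, `lev_input`, `wind_input_grid`) follow chgres_cube conventions, but this is an illustration of the pattern, not a copy of the linked source.

```fortran
! Sketch only (inside a routine that has "use esmf" and access to the
! chgres_cube module variables input_grid and lev_input): a single 4-D wind
! field with ungridded dimensions for the levels and the three components.
type(ESMF_Field) :: wind_input_grid
integer          :: rc

wind_input_grid = ESMF_FieldCreate(input_grid, &
                                   typekind=ESMF_TYPEKIND_R8, &
                                   staggerloc=ESMF_STAGGERLOC_CENTER, &
                                   name="input_grid_wind", &
                                   ungriddedLBound=(/1, 1/), &
                                   ungriddedUBound=(/lev_input, 3/), rc=rc)

! Each regrid of this field moves all lev_input*3 horizontal slices in one
! operation, which is where the very large MPI messages come from.
```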
Interpolating each wind component separately or as a field bundle would likely save memory.
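A minimal sketch of the per-component approach is below: each component lives in its own 3-D field (horizontal plus vertical level) and the three regrids reuse one routehandle, so no single call has to move every component at once. All names are illustrative, and the staggering is shown as CENTER for brevity even though the real target winds live on the grid-box edges. An ESMF_FieldBundle over the three per-component fields, regridded with ESMF_FieldBundleRegrid, would be an equivalent way to organize the same calls.

```fortran
! Sketch of per-component regridding (illustrative names; assumes a routine
! with "use esmf" and the usual chgres_cube module variables input_grid,
! target_grid, lev_input).  Horizontal regridding stays on the input levels;
! vertical interpolation to the target levels remains a separate step.
type(ESMF_Field)       :: comp_input(3), comp_target(3)
type(ESMF_RouteHandle) :: rh
integer                :: n, rc

do n = 1, 3
  comp_input(n)  = ESMF_FieldCreate(input_grid,  typekind=ESMF_TYPEKIND_R8, &
                                    staggerloc=ESMF_STAGGERLOC_CENTER, &
                                    ungriddedLBound=(/1/), &
                                    ungriddedUBound=(/lev_input/), rc=rc)
  comp_target(n) = ESMF_FieldCreate(target_grid, typekind=ESMF_TYPEKIND_R8, &
                                    staggerloc=ESMF_STAGGERLOC_CENTER, &
                                    ungriddedLBound=(/1/), &
                                    ungriddedUBound=(/lev_input/), rc=rc)
enddo

! One interpolation matrix serves all three components because they share
! the same grids and stagger locations.
call ESMF_FieldRegridStore(comp_input(1), comp_target(1), routehandle=rh, &
                           regridmethod=ESMF_REGRIDMETHOD_BILINEAR, rc=rc)

do n = 1, 3
  call ESMF_FieldRegrid(comp_input(n), comp_target(n), routehandle=rh, rc=rc)
enddo

call ESMF_FieldRegridRelease(routehandle=rh, rc=rc)
```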