Open junwang-noaa opened 2 months ago
Corresponding UFS weather model issues are:
https://github.com/ufs-community/ufs-weather-model/issues/2345
https://github.com/ufs-community/ufs-weather-model/issues/2346
There is a build problem that needs to be resolved by the teams:
MAPL 2.46.2/ESMF 8.6.1 (Hang)
This happens on all machines, not just WCOSS2.
Is there a new release of MAPL now? @Hang-Lei-NOAA can you try installing it?
I installed them and tested them with UFS last Thursday. The problem persists unless ESMF is manually linked to UFS.
I emailed Alex last Friday asking him to add this to spack-stack 1.6.0, which he maintains. I added extra temporary installations alongside his spack-stack installations, but I cannot modify some existing files. Although Brian tested other temporary installations successfully, my tests with the new ESMF and mapl/2.46.3 still show the old problem. My additions are fully removable (chmod 777).
@DusanJovic-NOAA will test installation provided by @AlexanderRichert-NOAA on acorn.
I do not have any information about Alex's installation on Acorn. I looked at linked ufs-weather-model issues. Where is it?
@DusanJovic-NOAA That is the email I forwarded to you during my vacation. Chained env with ESMF/MAPL updates: /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.2/install/modulefiles/Core
@edwardhartnett @AlexanderRichert-NOAA Can you provide details on the installation, either in this issue or in ufs-weather-model issue #2345? When was it installed, and how do we load the module? Without this information, we can't test ufs-weather-model.
@Hang-Lei-NOAA I think Ed said we need to test the library Alex installed. Also, the MAPL version is 2.46.3.
@junwang-noaa That is it. As Alex mentioned in the email, these libraries have been added to spack-stack-1.6.0 on Acorn, as Dusan originally requested. Spack-stack: /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.2/install/modulefiles/Core
I will also test my new installations using Dusan's branch under /lfs/h1/emc/nceplibs/noscrub/hpc-stack/libs/hpc-stack/modulefiles/mpi/intel/19.1.3.304/cray-mpich/8.1.9
@edwardhartnett As discussed in ufs-weather-model issue #2345, MAPL 2.46.2 does not work. But the library at /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.2/install/modulefiles/Core is still using MAPL 2.46.2. Are you going to ask Alex to install a new spack-stack version?
@Hang-Lei-NOAA I assume your installation is for WCOSS2 testing since it is using hpc-stack. Is that correct? Also, is your testing working? Can you list the module file location and the test log? Thanks
@junwang-noaa I will ask Alex to check.
My test with Dusan's branch still has the issue with GOCART: CMake Error at CMakeLists.txt:156 (find_package): No "FindESMF.cmake" found in CMAKE_MODULE_PATH.
/lfs/h1/emc/nceplibs/noscrub/Hang.Lei/works/dusanufs/modulefiles/ufs_acorn.intel.lua
@DusanJovic-NOAA Alex updated spack-stack/1.6.0 on Acorn: /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.3/install/modulefiles/Core
Compilation fails with this error:
Force 32-bit build for GOCART
CMake Error at GOCART/CMakeLists.txt:63 (find_package):
By not providing "FindGFTL_SHARED.cmake" in CMAKE_MODULE_PATH this project
has asked CMake to find a package configuration file provided by
"GFTL_SHARED", but CMake did not find one.
Could not find a package configuration file provided by "GFTL_SHARED" with
any of the following names:
GFTL_SHAREDConfig.cmake
gftl_shared-config.cmake
Add the installation prefix of "GFTL_SHARED" to CMAKE_PREFIX_PATH or set
"GFTL_SHARED_DIR" to a directory containing one of the above files. If
"GFTL_SHARED" provides a separate development package or SDK, be sure it
has been installed.
-- Configuring incomplete, errors occurred!
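One quick way to narrow down the error above is to check whether either of the two config file names CMake lists is present under any prefix in the loaded environment. The sketch below is a simplification: it walks each prefix recursively rather than following CMake's exact search procedure, and it assumes the loaded modules export `CMAKE_PREFIX_PATH` as an environment variable. The helper names are hypothetical.

```python
import os
from pathlib import Path


def find_package_configs(package, prefixes):
    """Return files named <package>Config.cmake or <lowercase package>-config.cmake.

    Simplification: searches every subdirectory of each prefix, which is
    broader than CMake's real config-mode search.
    """
    wanted = {f"{package}Config.cmake", f"{package.lower()}-config.cmake"}
    hits = []
    for prefix in prefixes:
        root = Path(prefix)
        if not root.is_dir():
            continue  # skip prefixes that do not exist on this machine
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if name in wanted:
                    hits.append(Path(dirpath) / name)
    return sorted(hits)


def prefixes_from_env(var="CMAKE_PREFIX_PATH"):
    """Split a path-list environment variable into its non-empty entries."""
    return [p for p in os.environ.get(var, "").split(os.pathsep) if p]


if __name__ == "__main__":
    # Assumption: run after loading the stack's modules.
    for hit in find_package_configs("GFTL_SHARED", prefixes_from_env()):
        print(hit)
```

If nothing is printed, no gftl-shared prefix with a config file is visible to the build, which would be consistent with the module for that package being missing or not loaded.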
In current spack-stack, gftl-shared module is:
$ ll /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.6.0/envs/unified-env/install/modulefiles/intel/2022.0.2.262/gftl-shared/
total 4
-rw-r--r-- 1 alexander.richert nceplibs 1182 Jan 6 2024 1.6.1.lua
in ue-esmf-8.6.1-mapl-2.46.3 stack it is:
$ ll /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.3/install/modulefiles/intel/2022.0.2.262/gftl-shared/
total 4
-rw-r--r-- 1 alexander.richert nceplibs 1219 Aug 30 19:40 main.lua
According to the MAPL Spack recipe, versions 2.45.x and up require gftl-shared v1.8.0 and up. I can use v1.8.0 or v1.9.0, or I can chance it with 1.6.1 but no promises it wouldn't break anything.
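The version rule quoted above can be written as a small check. This is only a sketch of the rule as stated in this thread (MAPL 2.45.x and up needs gftl-shared 1.8.0 and up); it does not read the Spack recipe, and the function names are hypothetical. A module named `main` has no numeric version, so the check reports it as undecidable rather than guessing.

```python
import re


def parse_version(text):
    """Turn '1.9.0' or 'v1.8' into a 3-part tuple; return None for names like 'main'."""
    match = re.fullmatch(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        return None
    parts = [int(p) for p in match.group(1).split(".")]
    parts += [0] * (3 - len(parts))  # pad '1.8' to (1, 8, 0)
    return tuple(parts)


def gftl_shared_ok_for_mapl(mapl, gftl_shared):
    """Apply the rule from the thread.

    Returns True or False when the rule decides, and None when a version
    is not numeric or when MAPL is older than 2.45 (rule does not apply).
    """
    mapl_v = parse_version(mapl)
    gftl_v = parse_version(gftl_shared)
    if mapl_v is None or gftl_v is None:
        return None
    if mapl_v < (2, 45, 0):
        return None
    return gftl_v >= (1, 8, 0)


if __name__ == "__main__":
    for gftl in ("1.6.1", "1.9.0", "main"):
        print("mapl 2.46.3 with gftl-shared", gftl, "->",
              gftl_shared_ok_for_mapl("2.46.3", gftl))
```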
Whatever, we just need to have exactly the same module version and the same name of the modules on all RDHPCS platforms and Acorn, because we use ufs_common.lua on all of them.
I also see that the current name of the mapl module is mapl/2.46.2-esmf-8.6.1, while the new one is just mapl/2.46.3. If we are changing the naming on Acorn, the new name must be used on all other machines.
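Because ufs_common.lua is shared, a name or version mismatch on one machine is easy to miss. The sketch below compares module trees and reports any `name/version` that is not present everywhere. It assumes each tree follows the `<name>/<version>.lua` layout seen in the listings above; the platform keys and roots passed in are placeholders, not real paths, and the function names are hypothetical.

```python
from pathlib import Path


def list_modules(module_root):
    """Collect 'name/version' strings from <module_root>/<name>/<version>.lua."""
    root = Path(module_root)
    return {f"{lua.parent.name}/{lua.stem}" for lua in root.glob("*/*.lua")}


def inconsistent_modules(roots_by_platform):
    """Return {module: [platforms lacking it]} for modules not found on every platform."""
    found = {
        platform: list_modules(root)
        for platform, root in roots_by_platform.items()
    }
    everything = set().union(*found.values())
    return {
        module: sorted(p for p, mods in found.items() if module not in mods)
        for module in sorted(everything)
        if any(module not in mods for mods in found.values())
    }


if __name__ == "__main__":
    # Placeholder roots: replace with the compiler-level modulefiles
    # directory of each platform's environment.
    roots = {"acorn": "/path/to/acorn/modulefiles", "hera": "/path/to/hera/modulefiles"}
    for module, missing in inconsistent_modules(roots).items():
        print(f"{module}: missing on {', '.join(missing)}")
```

With this layout, mapl/2.46.3 on one machine and mapl/2.46.3-esmf-8.6.1 on another would show up as two separate entries, each missing on the other platform.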
Okay, I installed with gftl-shared@1.9.0, and I updated the module file to follow the mapl/xxx-esmf-xxx pattern.
Thanks.
I ran cpld_control_p8 test and it failed. I see these messages in the stderr file:
pe=00000 FAIL at line=01088 MAPL_CapGridComp.F90 <status=41>
pe=00000 FAIL at line=01088 MAPL_CapGridComp.F90 <status=41>
pe=00000 FAIL at line=01560 MAPL_EsmfRegridder.F90 <destination masking with this regrid type is unsupported>
pe=00000 FAIL at line=01382 MAPL_EsmfRegridder.F90 <status=1>
pe=00000 FAIL at line=00977 MAPL_AbstractRegridder.F90 <status=1>
pe=00000 FAIL at line=00097 NewRegridderManager.F90 <status=1>
pe=00000 FAIL at line=01101 GriddedIO.F90 <status=1>
pe=00000 FAIL at line=04539 ExtDataGridCompMod.F90 <status=1>
pe=00000 FAIL at line=01468 ExtDataGridCompMod.F90 <status=1>
pe=00000 FAIL at line=01838 MAPL_Generic.F90 <status=1>
pe=00000 FAIL at line=01241 MAPL_CapGridComp.F90 <status=1>
pe=00000 FAIL at line=01204 MAPL_CapGridComp.F90 <status=1>
pe=00000 FAIL at line=01164 MAPL_CapGridComp.F90 <status=1>
pe=00000 FAIL at line=00832 MAPL_CapGridComp.F90 <status=1>
pe=00000 FAIL at line=00972 MAPL_CapGridComp.F90 <status=1>
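In a trace like the one above, most lines carry only a status code and a single line carries a readable message. The sketch below pulls those descriptive lines out of a stderr file. The line format is inferred from the output pasted in this thread, not from MAPL documentation, and the function names are hypothetical.

```python
import re

# Pattern inferred from lines such as:
#   pe=00000 FAIL at line=01088 MAPL_CapGridComp.F90 <status=41>
FAIL_RE = re.compile(
    r"pe=(?P<pe>\d+) FAIL at line=(?P<line>\d+)\s+(?P<file>\S+)\s+<(?P<detail>[^>]*)>"
)


def parse_fail_lines(text):
    """Return one dict per FAIL line, in the order they appear."""
    return [
        {
            "pe": int(m["pe"]),
            "line": int(m["line"]),
            "file": m["file"],
            "detail": m["detail"],
        }
        for m in FAIL_RE.finditer(text)
    ]


def descriptive_failures(frames):
    """Keep frames whose detail is a message rather than a bare 'status=N'."""
    return [f for f in frames if not re.fullmatch(r"status=\d+", f["detail"])]


if __name__ == "__main__":
    sample = (
        "pe=00000 FAIL at line=01088 MAPL_CapGridComp.F90 <status=41>\n"
        "pe=00000 FAIL at line=01560 MAPL_EsmfRegridder.F90 "
        "<destination masking with this regrid type is unsupported>\n"
        "pe=00000 FAIL at line=01382 MAPL_EsmfRegridder.F90 <status=1>\n"
    )
    for frame in descriptive_failures(parse_fail_lines(sample)):
        print(f'{frame["file"]}:{frame["line"]}: {frame["detail"]}')
```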
With updated GOCART (head of current develop branch), ufs-weather-model is still failing, this time with the error in SU2G_GridCompMod.F90:
pe=00136 FAIL at line=00193 SU2G_GridCompMod.F90 <status=41>
pe=00136 FAIL at line=04713 MAPL_Generic.F90 <status=41>
pe=00136 FAIL at line=04900 MAPL_Generic.F90 <status=41>
pe=00136 FAIL at line=01338 GOCART2G_GridCompMod.F90 <status=41>
pe=00136 FAIL at line=01316 GOCART2G_GridCompMod.F90 <status=41>
pe=00136 FAIL at line=00188 GOCART2G_GridCompMod.F90 <status=41>
This is probably due to how GOCART is configured in our regression tests.
I had the same error with non spack-stack installations.
@DusanJovic-NOAA and @Hang-Lei-NOAA is there an install of these versions that is working anywhere? That is, is there a successful case of these software packages working together?
OK, as a data point, I installed spack-stack-1.8.0 and it installs the correct versions of netCDF (4.9.2), MAPL (2.46.3), and ESMF (8.6.1). Netcdf-c is built with zstd, only one copy of the netCDF library is installed, and all other applications use that one. So all of that is good.
Email from Ed:
All,
There is a current issue on WCOSS2-requests: Install ESMF 8.6.1 and MAPL 2.46.2 -> 2.46.3.
Hang has installed the requested versions of ESMF and MAPL, all built and ESMF passed unit testing (MAPL has no tests). All are using netcdf-c-4.9.2.
When the UFS regression tests are run, there are failures with GOCART cases. See the issue for the exact description. This does not seem to be an installation issue, but a software issue. ESMF-8.6.1 and MAPL-2.46.3 are installed correctly. We have tested both with hpc-stack and spack-stack installs, with the same results. On orion, Brian has apparently encountered the same problems with this combination of software versions.
Hang has experimented and has found when the older MAPL version is used, the regression tests pass.
I'm not sure there is anything further our group can do on this issue. We have installed the software as requested, but cannot fix it, unfortunately. We understand that Brian and Dusan are following up with the MAPL team.
Please let us know if there is anything else we can do to help move this forward.
Thanks, Ed & Hang
Reply from Jun:
We need a bug fix from MAPL 2.46.3. At Monday's model infrastructure meeting, Barry agreed to take a look at the GOCART failure. Dusan transferred the test case to Hera, I just tagged Barry.
@AlexanderRichert-NOAA The ue-esmf-8.6.1-mapl-2.46.3 environment on Acorn does not have g2/3.5.1 and g2tmpl/1.13.0. Can you please add them?
Will do
@DusanJovic-NOAA, please try /lfs/h1/emc/nceplibs/noscrub/spack-stack/spack-stack-1.6.0/envs/upp-esmf-8.6.1-mapl-2.46.3/install/modulefiles/Core
@DusanJovic-NOAA did you find the versions you need?
For your tests, you can find new installations (esmf-8.6.1-mapl-2.46.3) on Orion and Hercules:
Hercules: /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.3/install/modulefiles/Core
Orion: /work/noaa/epic/role-epic/spack-stack/orion/spack-stack-1.6.0/envs/ue-esmf-8.6.1-mapl-2.46.3/install/modulefiles/Core
Included are new g2, g2tmpl and fms.
One way I think we went astray here is biting off too much at once.
Can we update ESMF to 8.6.1 and get that all resolved before we upgrade MAPL?
Install ESMF 8.6.1 and MAPL 2.46.2 after netcdf build with zstd is available on wcoss2.
MAPL 2.46.2 has issues when running with the UFS weather model. MAPL 2.46.3 has the fix; please install 2.46.3 with ESMF 8.6.1.
8/30/2024: To clarify, MAPL 2.46.3 needs to be installed with ESMF 8.6.1 in both spack-stack 1.6.0 and HPC-stack on Acorn for UFS weather model testing.