Open emilyhcliu opened 1 year ago
@emilyhcliu EPIC has been supporting hpc-stack on Orion at /work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2. Regarding hpc-stack-gfsv16, there may be a permission issue (on the EPIC side) with access to /apps/contrib/NCEP/libs/hpc-stack/modulefiles/stack. By any chance, would it be possible to migrate hpc-stack-gfsv16 to an EPIC location? @natalie-perlin FYI
@jkbk2004 GSI has a run-time issue when it is compiled with intel-2022. The issue is tracked in https://github.com/NOAA-EMC/GSI/issues/447. So, we cannot use the libraries under intel-2022.
For GSI develop, we would like to move to the EPIC HPC stacks (hopefully, we can resolve the issue with intel-2022). For GSI release/gfsda.v16 (currently used in operational systems), we would like to stay with the hpc-stacks.
So, I think there are two options: (1) make intel-2018 also available under EPIC HPC (2) update the hpc-stack
Any thoughts?
@natalie-perlin can we add the hpc stack option on orion epic space for this gsi requirement?
@jkbk2004 @emilyhcliu - The intel-18 modules for the GSI may only be helpful as a debugging step, until the issue with higher-version Intel compilers is solved. However, using different compilers to build different parts of the UFS Apps may not be a community-recommended approach...
@natalie-perlin The GSI cannot run with intel 2021+ on any system until the above mentioned issue is resolved. I think everyone agrees that it would be ideal for everything to move to Intel 2022, but, unfortunately, this is not possible for the GSI yet. So all GSI dependencies, including CRTM, need to be compiled with Intel 18 for the time being on all systems.
@DavidHuber-NOAA - what about the cases with GNU compilers? EPIC supports software stacks with gnu compilers on Hera and Cheyenne that are built to support UFS-WM and UFS-SRW
@natalie-perlin I believe these would be required as well, though I can't say with certainty. I've only been helping with the Intel 2022 issue and am not an authority on the GSI otherwise.
The stack for GSI modules built with the intel-2018.4 + impi/2018.4 compilers is ready on Orion.
The modules listed in gsi_common.lua and gsi_orion.lua have been built.
The way to load:
module use /work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2018.4/modulefiles/stack
module load hpc/1.2.0
Lines 4-6 in gsi_orion.lua would then become:
prepend_path("MODULEPATH", "/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2018.4/modulefiles/stack")
local hpc_ver=os.getenv("hpc_ver") or "1.2.0"
Update: Alternatives built: w3emc/2.9.1, w3emc/2.9.2
An identical stack is being built with the intel-2022.1.2 compiler, which hopefully can be used to debug the issue with compilers newer than intel/2020. (Fingers crossed)
Please let us know if you have any comments on the modules that have been built or that still need to be built.
HPC-stack with intel/2022.1.2 compiler on Orion:
module use /work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2_gsi/modulefiles/stack
module load hpc/1.2.0
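The load recipes above differ only in the stack directory name, so they could be wrapped in a small helper. This is a minimal sketch, not something EPIC provides: the function names `epic_stack_modulepath` and `load_epic_stack` and the `EPIC_STACK_BASE` variable are hypothetical, and the base directory and the `hpc/1.2.0` default are taken from this thread. It assumes the `module` command is available in the shell, as it is on Orion.

```shell
# Hypothetical helper for the Orion EPIC stacks mentioned in this thread.
# EPIC_STACK_BASE defaults to the Orion location quoted above (override if needed).
EPIC_STACK_BASE="${EPIC_STACK_BASE:-/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs}"

# Print the modulefiles path for a stack directory name,
# e.g. intel-2018.4 or intel-2022.1.2_gsi.
epic_stack_modulepath() {
  printf '%s/%s/modulefiles/stack\n' "$EPIC_STACK_BASE" "$1"
}

# Add the stack to MODULEPATH and load the hpc meta-module.
# Honors an existing hpc_ver environment variable, else uses 1.2.0.
load_epic_stack() {
  local mpath ver
  mpath="$(epic_stack_modulepath "$1")"
  ver="${hpc_ver:-1.2.0}"
  if [ ! -d "$mpath" ]; then
    echo "stack not found: $mpath" >&2
    return 1
  fi
  module use "$mpath"
  module load "hpc/${ver}"
}
```

For example, `load_epic_stack intel-2018.4` would select the Intel 2018 build and `load_epic_stack intel-2022.1.2_gsi` the limited GSI build; the directory check only guards against a mistyped name or a path that is not visible from the current node.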
@jkbk2004 - The crtm/2.4.0 fix files have been updated on all the NOAA RDHPC systems. The updated CRTM-2.4.0 code, which removes the excessive printout statements mentioned in GSI Issue-556, has so far been installed only in the newer EPIC-maintained hpc-stacks that are based on netcdf-4.9.2. EPIC's set of stacks with netcdf-4.7.4 still uses the library version built with the excessive printouts. I'd like to update crtm/2.4.0 in these current stacks, as this was raised as an issue by the GSI team. When is the best time to do the update to avoid disrupting any RT testing (weekend, early mornings, after PR-1745)? WM may move to the updated netcdf-4.9.2-based stacks, which are free of the excessive printouts, as in https://github.com/ufs-community/ufs-weather-model/pull/1745. But other repositories, such as GSI, global_workflow, UFS_UTILS, SRW, etc., may still be using the older netcdf-4.7.4-based stack builds for some time.
@emilyhcliu @jkbk2004 - I will plan to fully update the CRTM-2.4.0 code with the new code that contains a bug-fix in all EPIC-maintained hpc-stacks that are built with netcdf/4.7.4 over the weekend, when it is unlikely to interfere with the WM and SRW tests. So far, the update has been done to the newer stacks built with netcdf/4.9.2.
A stack with intel/2018.4 on Orion was built recently (May 2, 2023), and it uses the recent CRTM-2.4.0 code:
/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2018.4
The same is true for the limited-library stack build for the GSI team on Orion, with intel/2022.1.2 compiler, in
/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2_gsi
The crtm/2.4.0 stack update would also require rebuilding the upp library, since it depends on crtm. I will post here when it is done.
All the active and current EPIC stacks have been updated with the latest CRTM/2.4.0 and corresponding CRTM fix files. Please see below the stack locations:
Hera intel/2022.1.2:
/scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/intel-2022.1.2
/scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/intel-2022.1.2_ncdf492
Hera gnu/9.2.0:
/scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/gnu-9.2
/scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/gnu-9.2_ncdf492
Orion intel/2022.1.2:
/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2
/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2_ncdf492/
Orion intel/2018.4:
/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2018.4
Jet intel/2022.1.2:
/mnt/lfs4/HFIP/hfv3gfs/role.epic/hpc-stack/libs/intel-2022.1.2
/mnt/lfs4/HFIP/hfv3gfs/role.epic/hpc-stack/libs/intel-2022.1.2_ncdf492
Jet intel/2018:
/mnt/lfs4/HFIP/hfv3gfs/role.epic/hpc-stack/libs/intel-18.0.5.274
Cheyenne intel/2022.1:
/glade/work/epicufsrt/contrib/hpc-stack/intel2022.1
/glade/work/epicufsrt/contrib/hpc-stack/src-intel2022.1_ncdf492
Cheyenne gnu/10.1.0:
/glade/work/epicufsrt/contrib/hpc-stack/gnu10.1.0
/glade/work/epicufsrt/contrib/hpc-stack//gnu10.1.0_ncdf49
Gaea intel-classic/2022.2.1:
/lustre/f2/dev/role.epic/contrib/hpc-stack/intel-classic-2022.2.1
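For scripts that run on more than one machine, the locations above could be kept in a single lookup. The following is only a sketch: the function name `epic_stack_root` and the `machine:compiler` key format are made up for illustration, the paths are copied from the list above, and only the first path of each entry is included (the `_ncdf492` variants and the Cheyenne entries are left out for brevity).

```shell
# Hypothetical lookup: map "<machine>:<compiler>" to the stack root listed in this thread.
# Prints the path on success; returns 1 for an unknown key.
epic_stack_root() {
  case "$1" in
    hera:intel-2022.1.2)
      echo /scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/intel-2022.1.2 ;;
    hera:gnu-9.2)
      echo /scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/gnu-9.2 ;;
    orion:intel-2022.1.2)
      echo /work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2 ;;
    orion:intel-2018.4)
      echo /work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2018.4 ;;
    jet:intel-2022.1.2)
      echo /mnt/lfs4/HFIP/hfv3gfs/role.epic/hpc-stack/libs/intel-2022.1.2 ;;
    jet:intel-18.0.5.274)
      echo /mnt/lfs4/HFIP/hfv3gfs/role.epic/hpc-stack/libs/intel-18.0.5.274 ;;
    gaea:intel-classic-2022.2.1)
      echo /lustre/f2/dev/role.epic/contrib/hpc-stack/intel-classic-2022.2.1 ;;
    *)
      echo "unknown stack key: $1" >&2
      return 1 ;;
  esac
}
```

A caller might use `root="$(epic_stack_root orion:intel-2018.4)"` and then point `module use` at the modulefiles directory under it. The `modulefiles/stack` subdirectory is confirmed in this thread only for the Orion stacks, so treat it as an assumption elsewhere.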
@DavidHuber-NOAA and @emilyhcliu Has anybody tried to run GSI with spack-stack (stack-intel/2021.7.1) on Hercules? It compiled successfully on Hercules, but I have issues at run time.
@BijuThomas-NOAA No, I have not tried yet. The GSI does not yet run with Intel 2021+ (NOAA-EMC/GSI#447 NOAA-EMC/GSI#571), but I have it working on Orion and am actively working on it on Hera with an apparent communication problem on Hera. @natalie-perlin has gotten it to work on Gaea and is actively working on Cheyenne. After that, we could perhaps try out spack stack and then Hercules.
The crtm version 2.4.0 installed under hpc-stack: /apps/contrib/NCEP/libs/hpc-stack/modulefiles/stack is outdated and needs an update.
Issue #517 is related to this issue.
Which software in the stack would you like installed? crtm version 2.4.0 and related coefficient files
What is the version/tag of the software? release/REL-2.4.0_emc
What compilation options would you like set? intel-2018.4
Which machines would you like to have the software installed? All HPC machines other than Hera; Hera has already been updated.
Any other relevant information that we should know to correctly install the software?
Additional context Question: For ORION, the hpc-stack is the one under active maintenance, the hpc-stack-gfsv16 is not, correct?