NOAA-EMC / hpc-stack

Create a software stack for HPC's
GNU Lesser General Public License v2.1

hpc-stack on Hercules #521

Open BijuThomas-NOAA opened 1 year ago

BijuThomas-NOAA commented 1 year ago

Wondering if hpc-stack is available on Hercules. The hurricane team would like to run HAFS on Hercules.

Thanks

climbfuji commented 1 year ago

spack-stack is available for testing

On May 5, 2023, at 8:46 AM, Biju Thomas @.***> wrote:

Wondering if hpc-stack is available on Hercules. The hurricane team would like to run HAFS on Hercules.

Thanks

Hang-Lei-NOAA commented 1 year ago

Please try the spack-stack installations. hpc-stack does not have compiler configurations for Hercules; they would need to be set up by hand, unless Hercules has compiler environment settings similar to an existing machine.

BijuThomas-NOAA commented 1 year ago

Thanks. Could somebody point out the documentation on how to use spack-stack on Hercules?

climbfuji commented 1 year ago

Hang on, the PR that updates the spack-stack documentation is still in the works ...

climbfuji commented 1 year ago

How to use spack-stack on hercules?

ALWAYS DO THIS

module purge
module use /work/noaa/epic-ps/role-epic-ps/spack-stack/modulefiles
module load ecflow/5.8.4-hercules
module load mysql/8.0.31-hercules

FOR INTEL, DO THIS

module use /work/noaa/epic-ps/role-epic-ps/spack-stack/spack-stack-1.3.1-hercules/envs/unified-env/install/modulefiles/Core
module load stack-intel/2021.7.1
module load stack-intel-oneapi-mpi/2021.7.1
module load stack-python/3.9.14
module available

FOR GNU, DO THIS

module use /work/noaa/epic-ps/role-epic-ps/spack-stack/spack-stack-1.3.1-hercules/envs/unified-env/install/modulefiles/Core
module load stack-gcc/11.3.1
module load stack-openmpi/4.1.4
module load stack-python/3.9.14
module available

BijuThomas-NOAA commented 1 year ago

@climbfuji Great! Thanks

climbfuji commented 1 year ago

I should warn you, we haven't done a lot of testing with this new software stack yet.

BijuThomas-NOAA commented 1 year ago

module load stack-intel/2021.7.1
Lmod has detected the following error: The following module(s) are unknown: "stack-intel/2021.7.1"
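
One way to narrow down an "unknown module" error like this is to check two things: that the modulefile exists on disk, and that its Core directory is on MODULEPATH (which is what "module use" sets). The following is a minimal POSIX shell sketch. The function names modulefile_exists and on_modulepath are made up for illustration, and the .lua suffix assumes Lua modulefiles as used by Lmod.

```shell
#!/bin/sh
# Hypothetical helpers, not part of spack-stack or Lmod.

# Return 0 if DIR/NAME/VERSION.lua exists (assumes Lua modulefiles).
modulefile_exists() {
    # $1 = modulefiles directory, $2 = module given as name/version
    [ -f "$1/$2.lua" ]
}

# Return 0 if DIR is one of the colon-separated entries in MODULEPATH.
# This is an exact string match; paths are not normalized.
on_modulepath() {
    case ":${MODULEPATH:-}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

core=/work/noaa/epic-ps/role-epic-ps/spack-stack/spack-stack-1.3.1-hercules/envs/unified-env/install/modulefiles/Core

modulefile_exists "$core" stack-intel/2021.7.1 || echo "modulefile not found under $core"
on_modulepath "$core" || echo "Core directory is not on MODULEPATH; run: module use $core"
```

If the modulefile exists but the directory is not on MODULEPATH, the "module use" line was most likely skipped or was undone by a later "module purge".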

climbfuji commented 1 year ago

It's there, and I just ran those commands successfully ... but I forgot to add the most important line above (I'll edit the comment):

ls -la /work/noaa/epic-ps/role-epic-ps/spack-stack/spack-stack-1.3.1-hercules/envs/unified-env/install/modulefiles/Core/stack-intel/2021.7.1.lua
module use /work/noaa/epic-ps/role-epic-ps/spack-stack/spack-stack-1.3.1-hercules/envs/unified-env/install/modulefiles/Core
module load stack-...

ulmononian commented 1 year ago

we have a draft pr in to add hercules support on the wm level. feel free to take a look at how i have the modulefile set for hercules here: https://github.com/ufs-community/ufs-weather-model/pull/1733. the fv3_conf files for hercules may also be of some use. however, this is not thoroughly tested yet, and the WM does not yet successfully run in coupled mode on hercules (at least in my testing). i am not sure about the hafs configuration there.

let me know if you have any questions! happy to help if i can.

BijuThomas-NOAA commented 1 year ago

@ulmononian Thank you. It is very helpful!

natalie-perlin commented 1 year ago

A new hpc-stack has been built on Hercules with the following compiler and MPI: intel-oneapi-compilers/2022.2.1 and intel-oneapi-mpi/2021.7.1.

The stack can be loaded as follows:

module use /work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2_hrcs/modulefiles/stack
module load hpc
module load hpc-intel-oneapi-compilers
module load hpc-intel-oneapi-mpi

Please see below the modules in the stack, as reported by "module list" (UPDATED 05/16/2023 to include esmf/8.4.2, mapl/2.35.2-esmf-8.4.2, and pio/2.5.10):

---- /work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2_hrcs/modulefiles/mpi/intel-oneapi-compilers/2022.2.1/intel-oneapi-mpi/2021.7.1 ----
   atlas/ecmwf-0.24.1        fms/2022.04                      nemsio/2.5.4    (D)
   crtm/2.4.0                hdf5/1.10.6               (D)    nemsiogfs/2.5.3
   eckit/ecmwf-1.16.0        madis/4.3                        netcdf/4.7.4    (D)
   esmf/8.3.0b09             mapl/2.22.0-esmf-8.3.0b09        pio/2.5.7
   esmf/8.4.2-debug          mapl/2.35.2-esmf-8.4.2    (D)    pio/2.5.10      (D)
   esmf/8.4.2         (D)    ncdiag/1.0.0                     upp/10.0.10
   fckit/ecmwf-0.9.2         ncio/1.1.2                       wrf_io/1.2.0

---- /work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2_hrcs/modulefiles/compiler/intel-oneapi-compilers/2022.2.1 ----
   bacio/2.4.1                          ip/3.3.3          (D)    prod_util/1.2.2
   bufr/11.7.0                          ip/4.0.0                 sfcio/1.4.1
   g2/3.4.5                             jasper/2.0.25     (D)    sigio/2.3.2
   g2c/1.6.4                            jpeg/9.1.0               sp/2.3.3
   g2tmpl/1.10.0                        landsfcutil/2.4.1        szip/2.1.1
   gfsio/1.4.1                          libpng/1.6.37            udunits/2.2.28  (D)
   gftl-shared/v1.5.0                   metplus/4.1.3            w3emc/2.9.2
   grib_util/1.2.4                      nccmp/1.8.9.0     (D)    w3nco/2.4.1     (D)
   gsl/2.7.1                     (D)    nco/5.0.6         (D)    yafyaml/v0.5.1
   hdf5/1.10.6                          nemsio/2.5.4             zlib/1.2.11     (D)
   hpc-intel-oneapi-mpi/2021.7.1 (L)    netcdf/4.7.4

Feel free to test it if possible. ESMF 8.4.2 likely needs to be used, not 8.3.0b09.
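
Because several versions of esmf, mapl and pio are installed side by side, it may be safer to request a version explicitly than to rely on the default marked (D). The following is a small sketch that lists the versions present for one module by scanning a modulefiles directory. The function name list_module_versions is made up for illustration, and the name/version.lua layout is an assumption about how these modulefiles are stored on disk.

```shell
#!/bin/sh
# Hypothetical helper: print the versions found for MODULE under DIR,
# assuming modulefiles are stored as DIR/MODULE/VERSION.lua.
list_module_versions() {
    # $1 = modulefiles directory, $2 = module name
    for f in "$1/$2"/*.lua; do
        [ -e "$f" ] || continue      # the glob matched nothing
        v=${f##*/}                   # strip the directory part
        printf '%s\n' "${v%.lua}"    # strip the .lua suffix
    done
}

mpi_dir=/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2_hrcs/modulefiles/mpi/intel-oneapi-compilers/2022.2.1/intel-oneapi-mpi/2021.7.1
list_module_versions "$mpi_dir" esmf

# Then load the wanted version by its full name, for example:
# module load esmf/8.4.2
```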

BijuThomas-NOAA commented 1 year ago

The tar command failed on Hercules compute nodes with the following environment:

module use /work/noaa/epic-ps/role-epic-ps/spack-stack/spack-stack-1.3.1-hercules/envs/unified-env/install/modulefiles/Core
module load stack-intel/2021.7.1                                                                     
module load stack-intel-oneapi-mpi/2021.7.1     

Here is the error message:

tar: Relink `/apps/spack-managed/gcc-11.3.1/intel-oneapi-compilers-2022.2.1-z2sjni66fcyqcsamnoccgb7c77mn37qj/compiler/2022.2.1/linux/compiler/lib/intel64_lin/libimf.so' with `/usr/lib64/libm.so.6' for IFUNC symbol `sinf'
/var/spool/slurmd/job01525/slurm_script: line 23: 6267 Segmentation fault (core dumped)

climbfuji commented 1 year ago

Sorry about that! Can you check whether spack installed tar itself? What does "which tar" say?

BijuThomas-NOAA commented 1 year ago

which tar
/usr/bin/tar
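
Since tar resolves to the system binary, one hypothesis (not confirmed in this thread) is that the loaded modules change the dynamic-linker environment so that Intel's libimf.so gets pulled into /usr/bin/tar. A quick way to test that is to run the command with LD_LIBRARY_PATH and LD_PRELOAD removed. The wrapper name run_with_clean_linker_env is made up for illustration.

```shell
#!/bin/sh
# Hypothetical diagnostic wrapper: run a command with the dynamic-linker
# variables removed from its environment. Relies on "env -u", which is
# available in GNU coreutils env.
run_with_clean_linker_env() {
    env -u LD_LIBRARY_PATH -u LD_PRELOAD "$@"
}

# Example: compare the two in a job script after loading the stack modules.
# tar --version
# run_with_clean_linker_env tar --version
```

If the wrapped call works while the plain one segfaults, the module environment is the likely cause, which is useful detail to include in a bug report.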

natalie-perlin commented 1 year ago

Re: hpc-stack installation. According to ESMF support, ESMF v8.4.2 is needed when using newer Intel compilers (such as intel-oneapi-compilers). The hpc-stack on Hercules has been updated to include the esmf/8.4.2, mapl/2.35.2-esmf-8.4.2, and pio/2.5.10 modules. (Some tests on another machine fail to compile with pio/2.5.7 built with the newer compilers.)

Updating the earlier comment with the modules shown from "module list" : https://github.com/NOAA-EMC/hpc-stack/issues/521#issuecomment-1546015335

climbfuji commented 1 year ago

which tar
/usr/bin/tar

We should probably create an issue in spack-stack for that so that we remember it.

BijuThomas-NOAA commented 1 year ago

Ok done. https://github.com/JCSDA/spack-stack/issues/584

climbfuji commented 1 year ago

Thanks very much!