esm-tools / esm_tools

Simple Infrastructure for Earth System Simulations
https://esm-tools.github.io/
GNU General Public License v2.0

Levante Experiences #578

Closed pgierz closed 2 years ago

pgierz commented 2 years ago

Please post any info and updates about using any of our models on the new DKRZ Levante system here!

pgierz commented 2 years ago

AWIESM 2.1

I will edit this comment here over time to provide updates.

I have made a branch based on feat/awiesm-final and merged most of feat/levante into it. There were some conflicts, in particular with fesom-2.1, that we will need to examine by hand.

So far, compile tests fail at the yaxt configure stage:

checking whether MPI_Send accepts const void * as first argument... yes
checking type of MPI_Aint... long
checking for mpirun... /sw/spack-levante/openmpi-4.1.2-yfwe6t/bin/mpirun
checking if /sw/spack-levante/openmpi-4.1.2-yfwe6t/bin/mpirun works... no
configure: error: in `/work/ab0246/a270077/SciComp/Model_Support/awiesm_porting/model_codes/awiesm-2.1/echam-6.3.05p2/yaxt':
configure: error: failed to run MPI programs
See `config.log' for more details
configure: error: ./configure failed for yaxt
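For anyone hitting the same failure: configure builds a trivial MPI program and then tries to launch it with the mpirun it found. A minimal sketch to redo that check by hand (the mpicc path is an assumption, guessed to sit next to the mpirun in the log; on many clusters mpirun cannot launch jobs on login nodes, which may be all that configure is tripping over):

# Sketch: reproduce configure's "mpirun works" check manually.
cat > conftest.c <<'EOF'
#include <mpi.h>
int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    MPI_Finalize();
    return 0;
}
EOF
# mpicc path assumed from the mpirun path found by configure
/sw/spack-levante/openmpi-4.1.2-yfwe6t/bin/mpicc -o conftest conftest.c
/sw/spack-levante/openmpi-4.1.2-yfwe6t/bin/mpirun -np 2 ./conftest && echo "mpirun works"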
seb-wahl commented 2 years ago

FOCI

I made a lot of changes (cleanup, ...) to levante.yaml to be able to include FOCI, structuring it the same way as we did for e.g. glogin.yaml. These changes are not yet committed. Current status: FOCI (ECHAM6 + OASIS + NEMO) compiles (Intel MPI) but crashes right at startup with error messages that give no hint as to what is going wrong. I contacted DKRZ and got the following reply:

Dear Sebastian,

there is currently one working setup known to me, but some testing by Kalle (in CC) and Michael Botzet from MPI-M is still needed before it can be released. Please contact Kalle for more details. In case you are using Intel MPI instead of Open MPI: Intel MPI is known to be problematic in some cases, since some patches from our side are still needed.

Kind regards, Irina Fast

On 28.03.2022 09:15, Wahl, Sebastian wrote:

Dear Support

Have you run (or do you know someone who has run) ECHAM6 and/or MPIESM (or any other coupled setup using OASIS) on Levante? If yes, would you or they be willing to share the settings (compile flags, runtime environment variables, ...) and experiences?

While compilation is fine, I currently cannot run our coupled setup FOCI (NEMO ocean with ECHAM6 atmosphere coupled via OASIS). It crashes right at the beginning without any useful error message.

Sebastian

seb-wahl commented 2 years ago

Just pushed my changes to feat/levante. FOCI is still not running, but that might be due to the Intel MPI issues mentioned above.

mandresm commented 2 years ago

AWICM3

IO library problems

Compilation

Running

seb-wahl commented 2 years ago

Sorry I missed this one (I promised to do this on Monday). I just recompiled eccodes with -DENABLE_ECCODES_OMP_THREADS=ON. This version is now available at /work/bb0519/HPC_libraries/intel2022.0.1_impi2021.5.0_20220318/
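For anyone who needs to rebuild it themselves, a rough sketch of such an ecCodes CMake build; only -DENABLE_ECCODES_OMP_THREADS=ON and the install prefix are taken from above, the source directory and parallelism are illustrative:

# Rough sketch: rebuild ecCodes with OpenMP thread support enabled.
mkdir build && cd build
cmake ../eccodes \
    -DENABLE_ECCODES_OMP_THREADS=ON \
    -DCMAKE_INSTALL_PREFIX=/work/bb0519/HPC_libraries/intel2022.0.1_impi2021.5.0_20220318
make -j 8
make install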

mandresm commented 2 years ago

Thanks a lot Sebastian!

seb-wahl commented 2 years ago

After helpful input from Enrico from DKRZ (I worked with him at the workshop), I was able to run FOCI with OpenMP at top speed. As we have the cmake compile for ECHAM6, here are the compile flags (the most important ones are -march and -mtune):

-O3 -fp-model source -qoverride-limits -assume realloc_lhs -align array64byte -no-prec-sqrt -no-prec-div -fast-transcendentals -m64 -march=core-avx2 -mtune=core-avx2 -fma -ip -pc64

I used the following recommended CPU layout for ECHAM6:

&parctl
    nproca = 32
    nprocb = 24
/
&runctl
   nproma = 8
/
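For context: nproca × nprocb must match the number of MPI tasks ECHAM6 is launched with (32 × 24 = 768 here), while nproma is the inner-loop blocking length, which mainly affects vectorization. And in case it helps with reproducing the build, a sketch of how the flags above could be passed to a CMake-based ECHAM6 build; the source/build paths are illustrative and this is not necessarily how our configs wire it up:

# Sketch: hand the Intel Fortran flags above to a CMake build of ECHAM6.
FFLAGS="-O3 -fp-model source -qoverride-limits -assume realloc_lhs -align array64byte"
FFLAGS="$FFLAGS -no-prec-sqrt -no-prec-div -fast-transcendentals -m64 -march=core-avx2"
FFLAGS="$FFLAGS -mtune=core-avx2 -fma -ip -pc64"
cmake -S echam-6.3.05p2 -B build -DCMAKE_Fortran_FLAGS="$FFLAGS"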