HPSCTerrSys / TSMP2

CMake-based TerrSysMP
https://github.com/HPSCTerrSys/TSMP

HDF5 support for ICON #11

Open BerndSchalge opened 10 months ago

BerndSchalge commented 10 months ago

Currently, ICON is compiled without HDF5 support.

If one wants to use the dace and RTTOV interfaces, HDF5 is required. It would be good to have these libraries included in cmake.

These libraries have to be added: -lhdf5hl_fortran -lhdf5_hl -lhdf5_fortran -lhdf5

For the 2023 stages the LDFLAGS have to have this added: -Wl,-rpath -Wl,/p/software/juwels/stages/2023/software/HDF5/1.12.2-ipsmpi-2022a/lib

I have already compiled ICON this way "by hand" and can confirm that the compilation runs without errors.
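For illustration, the flags listed above can be assembled from a single HDF5 install prefix. This is only a sketch: `hdf5_link_flags` is a hypothetical helper name, not something TSMP2 or ICON provides, and it assumes the libraries live in `<prefix>/lib` as in the 2023 stage path quoted above. A `-L<prefix>/lib` entry may also be needed on systems where the compiler does not already search that directory.

```shell
# Hypothetical helper: print the extra linker flags for a given HDF5 prefix.
# The library order matches the one given in this issue.
hdf5_link_flags() {
    hdf5_libdir="$1/lib"   # assumption: libraries are in <prefix>/lib
    printf '%s\n' "-Wl,-rpath -Wl,$hdf5_libdir -lhdf5hl_fortran -lhdf5_hl -lhdf5_fortran -lhdf5"
}

# Example with the 2023 stage prefix from this issue.
hdf5_link_flags /p/software/juwels/stages/2023/software/HDF5/1.12.2-ipsmpi-2022a
```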

kvrigor commented 10 months ago

Hi @BerndSchalge! HDF5 is a required dependency for ICON; thus I'd expect CMake to automatically add the necessary linker flags: https://github.com/HPSCTerrSys/eTSMP/blob/609d7437b7775144ef5e4581f894b5de3d82886a/cmake/BuildICON.cmake#L22-L24

Still it's possible that CMake didn't properly populate the HDF5_LIBRARIES variable. Have you verified this? You can discover the actual ICON build commands invoked by eTSMP by running

where ${BUILD_DIR} is the CMake build directory specified in Step 3 of the README.

BerndSchalge commented 10 months ago

I am not very familiar with cmake. What I found strange is that there is no dedicated FindHDF5.cmake in the cmake directory, similarly to netCDF or ecCodes, but perhaps this is optional.

If I check the configure as suggested I do find this bit: LIBS=-Wl,--as-needed /p/software/juwels/stages/2023/software/HDF5/1.12.2-ipsmpi-2022a/lib/libhdf5.so ...

So it seems some form of HDF5 support is enabled. But in the LDFLAGS there is no dedicated HDF5 entry. The full entry reads this: "LDFLAGS=-Wl,--copy-dt-needed-entries,--as-needed -L/p/project/detectrea/schalge1/TSMP_E/eTSMP/run/JUWELS_eCLM-ICON/OASIS3-MCT/lib -lpsmile.MPI1 -lmct -lmpeu -lscrip".

And it was at the linker stage where I had my issue, not during the actual compile. It is only an issue if --enable-rttov is used (which is required in that version of ICON if --enable-dace is used). One possibility I see is that the RTTOV library is not an .so but a .a library and that maybe causes problems with the linker flags.
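A quick way to see which HDF5 entries a flag string actually carries is to split it into tokens and filter. The sketch below uses a hypothetical helper name, `hdf5_tokens`. Some background that is general linker behaviour rather than anything confirmed for this build: a static archive records no dependencies of its own, so if the RTTOV library really is a `.a`, every HDF5 library it needs has to appear explicitly on the link line, after `-lrttov13`.

```shell
# Hypothetical helper: print every whitespace-separated token that mentions HDF5.
hdf5_tokens() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    printf '%s\n' "$1" | tr -s ' \t' '\n' | grep -i 'hdf5' || true
}

# LDFLAGS as reported above: prints nothing.
hdf5_tokens "-Wl,--copy-dt-needed-entries,--as-needed -L/p/project/detectrea/schalge1/TSMP_E/eTSMP/run/JUWELS_eCLM-ICON/OASIS3-MCT/lib -lpsmile.MPI1 -lmct -lmpeu -lscrip"

# Start of LIBS as reported above: prints only the C library libhdf5.so.
hdf5_tokens "-Wl,--as-needed /p/software/juwels/stages/2023/software/HDF5/1.12.2-ipsmpi-2022a/lib/libhdf5.so"
```

Applied to the full `LIBS` value from the configure line, this shows only `libhdf5.so`; the Fortran and high-level HDF5 libraries named at the top of this issue do not appear.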

kvrigor commented 10 months ago

Interesting; it seems that ICON's ./configure script doesn't automatically convert LIBS parameters to LDFLAGS. I pushed a quick fix to the branch bugfix-icon-ldflags; could you check if it works? Also, it would be helpful if you could share the ./configure line for your particular ICON build.
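To make the idea of converting LIBS entries into LDFLAGS concrete, here is a rough sketch. It is not the content of the bugfix-icon-ldflags branch, which is not shown in this thread; `libs_to_ldflags` is a hypothetical name, and the sketch only handles absolute paths ending in `lib<name>.so` (no versioned suffixes such as `.so.1`).

```shell
# Hypothetical helper: turn absolute shared-library paths into -L/-rpath/-l flags
# and pass every other argument through unchanged.
libs_to_ldflags() {
    out=""
    for lib in "$@"; do
        name=${lib##*/}                      # e.g. libhdf5.so
        case $lib in
            /*)
                case $name in
                    lib*.so)
                        dir=${lib%/*}        # directory containing the library
                        name=${name#lib}     # hdf5.so
                        name=${name%.so}     # hdf5
                        out="$out -L$dir -Wl,-rpath -Wl,$dir -l$name"
                        continue
                        ;;
                esac
                ;;
        esac
        out="$out $lib"
    done
    printf '%s\n' "${out# }"
}

libs_to_ldflags /p/software/juwels/stages/2023/software/HDF5/1.12.2-ipsmpi-2022a/lib/libhdf5.so -lm
```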

> there is no dedicated FindHDF5.cmake in the cmake directory, similarly to netCDF or ecCodes, but perhaps this is optional.

One of CMake's "selling points" is that it ships with Find*.cmake scripts (a.k.a. package finder scripts) for most commonly used libraries, and HDF5 is one of them. Package finder scripts aren't perfect, though; sometimes they return incomplete information or simply fail to find the libraries for some reason. If you're curious, here's the default implementation of FindHDF5.cmake.

BerndSchalge commented 10 months ago

The full configure command looks like this:

/p/project/detectrea/schalge1/TSMP_E/eTSMP/icon2.6.4_oascoup/configure \
  CC=/p/software/juwels/stages/2023/software/psmpi/5.7.0-1-intel-compilers-2022.1.0/bin/mpicc \
  FC=/p/software/juwels/stages/2023/software/psmpi/5.7.0-1-intel-compilers-2022.1.0/bin/mpif90 \
  "CFLAGS=-O3 -gdwarf-4 -qno-opt-dynamic-align -ftz -march=native" \
  "FCFLAGS=-O3 -I/p/project/detectrea/schalge1/TSMP_E/eTSMP/run/JUWELS_eCLM-ICON/OASIS3-MCT/include -I/p/project/detectrea/schalge1/RTTOV_DWD/juwels.intel/include -gdwarf-4 -march=native -pc64 -fp-model source -traceback -qno-opt-dynamic-align -no-fma" \
  ICON_ECRAD_FCFLAGS=-D__ECRAD_LITTLE_ENDIAN \
  "LDFLAGS=-Wl,--copy-dt-needed-entries,--as-needed -L/p/project/detectrea/schalge1/TSMP_E/eTSMP/run/JUWELS_eCLM-ICON/OASIS3-MCT/lib -lpsmile.MPI1 -lmct -lmpeu -lscrip" \
  "LIBS=-Wl,--as-needed /p/software/juwels/stages/2023/software/HDF5/1.12.2-ipsmpi-2022a/lib/libhdf5.so /p/software/juwels/stages/2023/software/Szip/2.1.1-GCCcore-11.3.0/lib/libsz.so /p/software/juwels/stages/2023/software/zlib/1.2.12-GCCcore-11.3.0/lib/libz.so /usr/lib64/libdl.so /usr/lib64/libm.so /p/software/juwels/stages/2023/software/intel-compilers/2022.1.0/compiler/2022.1.0/linux/compiler/lib/intel64/libiomp5.so /usr/lib64/libpthread.so /p/software/juwels/stages/2023/software/libxml2/2.9.13-GCCcore-11.3.0/lib/libxml2.so /p/software/juwels/stages/2023/software/XZ/5.2.5-GCCcore-11.3.0/lib/liblzma.so /p/software/juwels/stages/2023/software/imkl/2022.1.0/mkl/2022.1.0/lib/intel64/libmkl_intel_lp64.so /p/software/juwels/stages/2023/software/imkl/2022.1.0/mkl/2022.1.0/lib/intel64/libmkl_intel_thread.so /p/software/juwels/stages/2023/software/imkl/2022.1.0/mkl/2022.1.0/lib/intel64/libmkl_core.so /p/software/juwels/stages/2023/software/imkl/2022.1.0/compiler/2022.1.0/linux/compiler/lib/intel64_lin/libiomp5.so -lpthread -lm -ldl -L/p/project/detectrea/schalge1/TSMP_E/eTSMP/run/JUWELS_eCLM-ICON/OASIS3-MCT/lib -lpsmile.MPI1 -lmct -lmpeu -lscrip -L/p/software/juwels/stages/2023/software/netCDF/4.9.0-ipsmpi-2022a/lib64 -L/p/software/juwels/stages/2023/software/netCDF-Fortran/4.6.0-ipsmpi-2022a/lib -lnetcdff -lnetcdf -L/p/project/detectrea/schalge1/RTTOV_DWD/juwels.intel/lib -lrttov13" \
  MPI_LAUNCH=/usr/bin/srun \
  --prefix=/p/project/detectrea/schalge1/TSMP_E/eTSMP/run/JUWELS_eCLM-ICON \
  --disable-coupling \
  --disable-ocean \
  --disable-jsbach \
  --enable-oascoupling \
  --enable-ecrad \
  --enable-dace \
  --enable-rttov \
  --enable-parallel-netcdf

I will try out the new branch. However, it seems that I have to do a clean compile of ICON each time, and the configure step alone takes quite some time. I will let you know if that branch fixes my problem.

Update: I tried out the new system, but there is an issue during the ICON configure stage already very early. I get warnings about invalid host types and that I should use --build, --host, --target. This looks very much like a formatting issue of some command, as the invalid host types given are actually libraries.
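One plausible cause of these warnings, offered as an assumption rather than a diagnosis of the branch: an autoconf-generated configure script treats any argument that is neither an option nor a `VAR=value` assignment as a host type. If a space-separated library list reaches configure without quotes, only the first entry stays attached to `LIBS=` and each remaining library path becomes its own argument. The sketch below demonstrates the word splitting with a hypothetical `count_args` helper and made-up library paths.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { printf '%s\n' "$#"; }

# Made-up paths, for illustration only.
libs="/opt/hdf5/lib/libhdf5.so /opt/zlib/lib/libz.so"

# Unquoted: the shell splits on the space, so the second path arrives as a
# separate argument, which configure would then read as a host type.
count_args LIBS=$libs      # prints 2

# Quoted: the whole list stays inside one LIBS=... argument.
count_args "LIBS=$libs"    # prints 1
```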

kvrigor commented 9 months ago

> Update: I tried out the new system, but there is an issue during the ICON configure stage already very early. I get warnings about invalid host types and that I should use --build, --host, --target. This looks very much like a formatting issue of some command, as the invalid host types given are actually libraries.

Sorry, I missed this. Next time you can write another comment so I'd get notified by GitHub ;)

> I have already compiled ICON this way "by hand" and can confirm that the compilation runs without errors.

Could you share a working script/snippet?

BerndSchalge commented 9 months ago

Sorry for the edit, I will use new comments in the future.

In the ICON build directory I only changed the icon.mk file, and only the part below "Linker flags and libraries".

This part now looks like this:

LDFLAGS= -Wl,--copy-dt-needed-entries,--as-needed -L/p/project/detectrea/schalge1/TSMP_E/eTSMP/run/JUWELS_eCLM-ICON/OASIS3-MCT/lib -lpsmile.MPI1 -lmct -lmpeu -lscrip -Wl,-rpath -Wl,/p/project/detectrea/schalge1/TSMP_E/eTSMP/run/JUWELS_eCLM-ICON/OASIS3-MCT/lib -Wl,-rpath -Wl,/p/software/juwels/stages/2023/software/netCDF/4.9.0-ipsmpi-2022a/lib64 -Wl,-rpath -Wl,/p/software/juwels/stages/2023/software/netCDF-Fortran/4.6.0-ipsmpi-2022a/lib -Wl,-rpath -Wl,/p/project/detectrea/schalge1/RTTOV_DWD/juwels.intel/lib -Wl,-rpath -Wl,/p/software/juwels/stages/2023/software/HDF5/1.12.2-ipsmpi-2022a/lib -lhdf5hl_fortran -lhdf5_hl -lhdf5_fortran -lhdf5

BUNDLED_LIBFILES= externals/mtime/src/.libs/libmtime.a externals/cdi/src/.libs/libcdi_f2003.a externals/cdi/src/.libs/libcdi.a externals/ecrad/libradiation.a externals/ecrad/libifsrrtm.a externals/ecrad/libutilities.a externals/ecrad/libifsaux.a

LIBS= -Wl,--as-needed /p/software/juwels/stages/2023/software/HDF5/1.12.2-ipsmpi-2022a/lib/libhdf5.so /p/software/juwels/stages/2023/software/Szip/2.1.1-GCCcore-11.3.0/lib/libsz.so /p/software/juwels/stages/2023/software/zlib/1.2.12-GCCcore-11.3.0/lib/libz.so /usr/lib64/libdl.so /usr/lib64/libm.so /p/software/juwels/stages/2023/software/intel-compilers/2022.1.0/compiler/2022.1.0/linux/compiler/lib/intel64/libiomp5.so /usr/lib64/libpthread.so /p/software/juwels/stages/2023/software/libxml2/2.9.13-GCCcore-11.3.0/lib/libxml2.so /p/software/juwels/stages/2023/software/XZ/5.2.5-GCCcore-11.3.0/lib/liblzma.so /p/software/juwels/stages/2023/software/imkl/2022.1.0/mkl/2022.1.0/lib/intel64/libmkl_intel_lp64.so /p/software/juwels/stages/2023/software/imkl/2022.1.0/mkl/2022.1.0/lib/intel64/libmkl_intel_thread.so /p/software/juwels/stages/2023/software/imkl/2022.1.0/mkl/2022.1.0/lib/intel64/libmkl_core.so /p/software/juwels/stages/2023/software/imkl/2022.1.0/compiler/2022.1.0/linux/compiler/lib/intel64_lin/libiomp5.so -lpthread -lm -ldl -L/p/project/detectrea/schalge1/TSMP_E/eTSMP/run/JUWELS_eCLM-ICON/OASIS3-MCT/lib -lpsmile.MPI1 -lmct -lmpeu -lscrip -L/p/software/juwels/stages/2023/software/netCDF/4.9.0-ipsmpi-2022a/lib64 -L/p/software/juwels/stages/2023/software/netCDF-Fortran/4.6.0-ipsmpi-2022a/lib -lnetcdff -lnetcdf -L/p/project/detectrea/schalge1/RTTOV_DWD/juwels.intel/lib -lrttov13 -lhdf5hl_fortran -lhdf5_hl -lhdf5_fortran -lhdf5

It is possible that there are some redundancies in there, but this worked. I have not yet tried the new 2024 stage.
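To spot the possible redundancies, one option is to list the tokens that occur more than once in the combined flag string. This is a minimal sketch with a hypothetical helper name, `duplicate_flags`. A repeated token is not automatically safe to delete: with static libraries the link order matters, and `-Wl,-rpath` is always reported because it precedes every rpath directory.

```shell
# Hypothetical helper: print each whitespace-separated token that appears more than once.
duplicate_flags() {
    printf '%s\n' "$1" | tr -s ' \t' '\n' | sed '/^$/d' | LC_ALL=C sort | uniq -d
}

# Short excerpt in the style of the flags above; prints -lhdf5.
duplicate_flags "-lrttov13 -lhdf5_fortran -lhdf5 -lnetcdff -lhdf5"
```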