NOAA-EMC / hpc-stack

Create a software stack for HPC's
GNU Lesser General Public License v2.1

Cannot find -lhdf5_hl, -lhdf5 when building ufs-s2s-model prototype5.0 on stampede2 #21

Closed: benjamin-cash closed this issue 3 years ago

benjamin-cash commented 4 years ago

I installed hpc-stack on stampede2 and attempted to use it in place of NCEPLIBS-develop in my successful build of prototype5.0. It failed with the following:

mpiifort -o /work/02441/bcash/stampede2/s2s_p5_port/NEMS/exe/NEMS.x MAIN_NEMS.o module_NEMS_UTILS.o module_MEDIATOR_methods.o module_MEDIATOR.o module_MEDIATOR_SpaceWeather.o module_EARTH_INTERNAL_STATE.o module_EARTH_GRID_COMP.o module_NEMS_INTERNAL_STATE.o module_NEMS_GRID_COMP.o module_NEMS_Rusage.o nems_c_rusage.o /work/02441/bcash/stampede2/s2s_p5_port/CMEPS_INSTALL/libcmeps.a /work/02441/bcash/stampede2/s2s_p5_port/CMEPS_INSTALL/libcmeps_util.a /work/02441/bcash/stampede2/s2s_p5_port/CMEPS_INSTALL/libpiof.a /work/02441/bcash/stampede2/s2s_p5_port/CMEPS_INSTALL/libpioc.a /work/02441/bcash/stampede2/s2s_p5_port/WW3/model/obj_HYB/libww3_multi_esmf.a /work/02441/bcash/stampede2/s2s_p5_port/CICE-interface/CICE_INSTALL/libcice6.a /work/02441/bcash/stampede2/s2s_p5_port/MOM6-interface/MOM6_INSTALL/libmom.a /work/02441/bcash/stampede2/s2s_p5_port/MOM6-interface/MOM6_INSTALL/lib_ocean.a /work/02441/bcash/stampede2/s2s_p5_port/FV3/FV3_INSTALL/libfv3cap.a /work/02441/bcash/stampede2/s2s_p5_port/FV3/FV3_INSTALL/libccppdriver.a /work/02441/bcash/stampede2/s2s_p5_port/FV3/FV3_INSTALL/libfv3core.a /work/02441/bcash/stampede2/s2s_p5_port/FV3/FV3_INSTALL/libfv3io.a /work/02441/bcash/stampede2/s2s_p5_port/FV3/FV3_INSTALL/libipd.a /work/02441/bcash/stampede2/s2s_p5_port/FV3/FV3_INSTALL/libgfsphys.a /work/02441/bcash/stampede2/s2s_p5_port/FV3/FV3_INSTALL/libfv3cpl.a /work/02441/bcash/stampede2/s2s_p5_port/FV3/FV3_INSTALL/libstochastic_physics_wrapper.a /work/02441/bcash/stampede2/s2s_p5_port/FV3/FV3_INSTALL/libstochastic_physics.a /work/02441/bcash/stampede2/s2s_p5_port/FMS/FMS_INSTALL/libfms.a -L/work/02441/bcash/stampede2/s2s_p5_port/FV3/ccpp/lib -lccpp -lccppphys ENS_Cpl/ENS_Cpl.a /work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/nemsio/2.5.2/lib/libnemsio.a /work/02441/bcash/stampede2/intel-18.0.2/bacio/2.4.1/lib/libbacio_4.a /work/02441/bcash/stampede2/intel-18.0.2/sp/2.3.3/lib/libsp_d.a /work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/w3emc/2.7.3/lib/libw3emc_d.a 
/work/02441/bcash/stampede2/intel-18.0.2/w3nco/2.4.1/lib/libw3nco_d.a -L/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/esmf/8_1_0_beta_snapshot_27/lib -Wl,-rpath,/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/esmf/8_1_0_beta_snapshot_27/lib -lesmf -cxxlib -lrt -ldl -lnetcdff -lnetcdf -lhdf5_hl -lhdf5 -lz -ldl -lm -qopenmp -L/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/netcdf/4.7.4/lib -lnetcdff -lnetcdf -L -lpnetcdf
/opt/apps/gcc/6.3.0/bin/ld: cannot find -lhdf5_hl
/opt/apps/gcc/6.3.0/bin/ld: cannot find -lhdf5
gmake[1]: *** [nems] Error 1
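
The link line above passes `-L` only for the ccpp, ESMF, and NetCDF directories (plus one bare `-L` with no directory), so `ld` has no directory in which to look for libhdf5. A minimal sketch for checking this kind of failure, assuming a POSIX shell and paths without spaces; the helper name `find_lib_dirs` is made up for illustration and is not part of hpc-stack or the S2S build:

```shell
#!/bin/sh
# find_lib_dirs LIBNAME LINK_LINE
# Prints every -L directory on LINK_LINE that contains libLIBNAME.so or
# libLIBNAME.a. Prints nothing when no listed directory provides the library.
# Hypothetical helper: relies on word splitting, so paths must not contain spaces.
find_lib_dirs() {
    name=$1
    shift
    for tok in $*; do
        case $tok in
            -L?*)   # only "-Ldir" tokens; a bare "-L" is skipped
                dir=${tok#-L}
                if [ -e "$dir/lib$name.so" ] || [ -e "$dir/lib$name.a" ]; then
                    printf '%s\n' "$dir"
                fi
                ;;
        esac
    done
}
```

Calling `find_lib_dirs hdf5 "<the link line>"` and getting no output would match the `cannot find -lhdf5` error: the library is requested but none of the listed directories holds it.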

kgerheiser commented 4 years ago

You need to update your FindNetCDF.

See Dusan's PR to make ufs-weather-model work (and his modifications to FindNetCDF.cmake): https://github.com/ufs-community/ufs-weather-model/pull/193/files

aerorahul commented 4 years ago

@kgerheiser The S2S model does not use CMake, so the PR from @DusanJovic-NOAA is of no help here. The bug is in how the S2S model GNUMake sets the NetCDF include and lib paths. I believe @binli2337 has a fix/solution.

kgerheiser commented 4 years ago

Oh, ok. I didn't look super closely and thought it was the same problem I had yesterday.

aerorahul commented 4 years ago

@benjamin-cash I don't know how the S2S model builds on Stampede. Does it use the file conf/configure.fv3.cheyenne.intel? If so, replace lines 78-84 below:

INCLUDE = -I$(NETCDF_ROOT)/include
NETCDF_INC = -I$(NETCDF_ROOT)/include
ifneq ($(findstring netcdf/4,$(LOADEDMODULES)),)
  NETCDF_LIB += -L$(NETCDF)/lib -lnetcdff -lnetcdf
else
  NETCDF_LIB = -L$(NETCDF)/lib -lnetcdff -lnetcdf
endif

with

INCLUDE = -I$(nc-config --includedir)
NETCDF_INC = $(INCLUDE)
NETCDF_LIB = $(nc-config --libs)
NETCDF_LIB += $(nc-config --flibs)

benjamin-cash commented 4 years ago

@aerorahul It uses a file based on conf/configure.fv3.hera.intel. I found those lines and made the changes, testing now.

benjamin-cash commented 4 years ago

That failed with a bunch of cannot open include file 'netcdf.inc' errors in ../drifters/drifters_io.F90. It also looks like ./comp_ice.backend.libcice has my old esmf.mk path, so probably that needs to be changed as well?

benjamin-cash commented 4 years ago

There are also errors like this:

mpiicc -Duse_libMPI -Duse_netCDF -DSPMD -DUSE_LOG_DIAG_FIELD_INFO -DUSE_GFSL63 -DGFS_PHYS -Duse_WRTCOMP -DNEW_TAUCTMAX -DINTERNAL_FILE_NML -DNO_INLINE_POST -DMOIST_CAPPA -DUSE_COND -DOPENMP -DCCPP -I -xCORE-AVX2 -qno-opt-dynamic-align -D__IFC -sox -fp-model source -O2 -debug minimal -qopenmp -I/work/02441/bcash/stampede2/s2s_p5_port/FV3/ccpp/include -c ../mpp/nsclock.c -o ../mpp/nsclock.o
../mosaic/read_mosaic.c(27): catastrophic error: cannot open source file "netcdf.h"
  #include <netcdf.h>
                     ^

compilation aborted for ../mosaic/read_mosaic.c (code 4)

mpiifort -Duse_libMPI -Duse_netCDF -DSPMD -DUSE_LOG_DIAG_FIELD_INFO -DUSE_GFSL63 -DGFS_PHYS -Duse_WRTCOMP -DNEW_TAUCTMAX -DINTERNAL_FILE_NML -DNO_INLINE_POST -DMOIST_CAPPA -DUSE_COND -DOPENMP -DCCPP -fpp -Wp,-w -I -I -fno-alias -auto -safe-cray-ptr -save-temps -ftz -assume byterecl -nowarn -sox -align array64byte -i4 -real-size 64 -no-prec-div -no-prec-sqrt -xCORE-AVX2 -qno-opt-dynamic-align -O2 -debug minimal -fp-model source -qoverride-limits -qopt-prefetch=3 -qopenmp -I/work/02441/bcash/stampede2/s2s_p5_port/FV3/ccpp/include -I../include -I../mpp/include -I../fms -c ../block_control/block_control.F90 -o ../block_control/block_control.o
mpp_io.F90(371): #error: can't find include file: netcdf.inc
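
The bare `-I` with no directory in these compile lines is consistent with how GNU Make reads the suggested replacement: inside a makefile, `$(nc-config --includedir)` is a reference to a make variable of that name, not a command substitution, so it expands to nothing. The make form is `$(shell nc-config --includedir)`. Another option is to resolve the flags in the shell, where `$(...)` does run the command, and hand the results to make. A minimal sketch of that second option; the function name `netcdf_make_args` is hypothetical, and whether the S2S configure files accept these overrides unchanged is an assumption:

```shell
#!/bin/sh
# netcdf_make_args [NC_CONFIG]
# Prints NAME=value assignments built from nc-config output, one per line.
# NC_CONFIG defaults to the nc-config found on PATH (hypothetical helper).
netcdf_make_args() {
    nc=${1:-nc-config}
    inc=$("$nc" --includedir) || return 1
    libs=$("$nc" --libs) || return 1
    flibs=$("$nc" --flibs) || return 1
    printf 'INCLUDE=-I%s\n' "$inc"
    printf 'NETCDF_INC=-I%s\n' "$inc"
    # Fortran library first, so a static link resolves netcdff against netcdf.
    printf 'NETCDF_LIB=%s %s\n' "$flibs" "$libs"
}
```

The printed values can be pasted into the configure file or passed on the make command line, where command-line assignments take precedence over ordinary assignments in the makefile.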

aerorahul commented 4 years ago

It seems the include paths are missing.
Can you post the output of:

nc-config --includedir
nc-config --libs
nc-config --flibs

here?

benjamin-cash commented 4 years ago

login4.stampede2(987)$ nc-config --includedir
/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/netcdf/4.7.4/include
login4.stampede2(988)$ nc-config --libs
-L/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/netcdf/4.7.4/lib -lnetcdf
login4.stampede2(989)$ nc-config --flibs
-L/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/netcdf/4.7.4/lib -lnetcdff

aerorahul commented 4 years ago

Not entirely what I was expecting. What does nc-config --all report?

benjamin-cash commented 4 years ago

Looking back through the build log, it looks like at some point the new modules got purged and replaced with the older setup. I'm going to break for dinner and then I will keep looking to see where that might have happened.

aerorahul commented 4 years ago

Likewise. Happy to help you out tomorrow. Initiate a call on Slack.

benjamin-cash commented 4 years ago

login4.stampede2(1009)$ nc-config --all

This netCDF 4.7.4 has been built with the following features:

--cc -> mpiicc
--cflags -> -I/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/netcdf/4.7.4/include -I/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/hdf5/1.10.6/include
--libs -> -L/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/netcdf/4.7.4/lib -lnetcdf
--static -> -lm

--has-c++ -> no
--cxx ->

--has-c++4 -> yes
--cxx4 -> mpiicpc
--cxx4flags -> -I/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/netcdf/4.7.4/include -I/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/hdf5/1.10.6/include
--cxx4libs -> -L/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/netcdf/4.7.4/lib -lnetcdf_c++4 -lnetcdf

--has-fortran -> yes
--fc -> mpiifort
--fflags -> -I/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/netcdf/4.7.4/include -I/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/netcdf/4.7.4/include
--flibs -> -L/work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/netcdf/4.7.4/lib -lnetcdff
--has-f90 ->
--has-f03 -> yes

--has-dap -> no
--has-dap2 -> no
--has-dap4 -> no
--has-nc2 -> yes
--has-nc4 -> yes
--has-hdf5 -> yes
--has-hdf4 -> no
--has-logging -> no
--has-pnetcdf -> no
--has-szlib -> no
--has-cdf5 -> yes
--has-parallel4 -> yes
--has-parallel -> yes

--prefix -> /work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/netcdf/4.7.4
--includedir -> /work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/netcdf/4.7.4/include
--libdir -> /work/02441/bcash/stampede2/intel-18.0.2/impi-18.0.2/netcdf/4.7.4/lib
--version -> netCDF 4.7.4
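
In this output, `--libs` lists only `-lnetcdf` and carries no HDF5 directory, while `--cflags` does include the HDF5 include path. One way to recover a matching `-L` flag is to derive it from that include path. This is a sketch under two assumptions: that the HDF5 install has an `hdf5` path component, as it does above, and that its libraries live in a `lib` directory next to `include`. The function name `hdf5_link_flags` is made up for illustration.

```shell
#!/bin/sh
# hdf5_link_flags CFLAGS
# Scans CFLAGS for an include directory of the form .../hdf5/<version>/include
# and prints link flags pointing at the sibling lib directory.
# Returns non-zero when no such include directory is present.
hdf5_link_flags() {
    for tok in $1; do
        case $tok in
            -I*/hdf5/*/include)
                root=${tok#-I}          # drop the leading -I
                root=${root%/include}   # drop the trailing /include
                printf '%s\n' "-L$root/lib -lhdf5_hl -lhdf5"
                return 0
                ;;
        esac
    done
    return 1
}
```

A call such as `hdf5_link_flags "$(nc-config --cflags)"` would print flags that could be appended to NETCDF_LIB. If the install uses `lib64` instead of `lib`, the printed path needs adjusting.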

aerorahul commented 4 years ago

From the looks of it, it appears to be a shared library.

Could I ask how you built the stack? What config.sh and what stack.yaml did you use?

MinsukJi-NOAA commented 4 years ago

@aerorahul, are -lhdf5 and -lnetcdf needed separately? Or does it depend on the version of netcdf?

aerorahul commented 4 years ago

NETCDF builds with HDF5. It is needed.

aerorahul commented 4 years ago

I am also realizing that these flags are probably not set properly in the ESMF linker. I am certain this is not a stack issue, but rather an S2S model issue.

MinsukJi-NOAA commented 4 years ago

Do you know if ESMF_F90LINKPATHS is defined from hpc-stack?

aerorahul commented 4 years ago

The modulefile for ESMF defines ESMFMKFILE. It does not explicitly define any specific PATHS.

arunchawla-NOAA commented 4 years ago

Looks like we need to update the GNU build system to work with hpc-stack. @binli2337, did you not have a branch that works? Maybe we need a PR back to the S2S model repo.

MinsukJi-NOAA commented 4 years ago

I tried adding HDF5_LIBRARIES: NETCDF_LIB += -L$(NETCDF)/lib -lnetcdff -lnetcdf -L$(HDF5_LIBRARIES) in configure.*, but it instead gave me g2 errors:

/work/07738/kgerheis/stampede2/hpc-stack/v1.0.0-beta1/intel-18.0.2/g2/3.4.1/lib/libg2_4.a(enc_jpeg2000.c.o): In function `enc_jpeg2000_':
/work/07738/kgerheis/stampede2/hpc-stack/src/hpc-stack/pkg/g2-v3.4.1/src/enc_jpeg2000.c:167: undefined reference to `jas_stream_memopen'
/work/07738/kgerheis/stampede2/hpc-stack/src/hpc-stack/pkg/g2-v3.4.1/src/enc_jpeg2000.c:174: undefined reference to `jas_stream_memopen'
/work/07738/kgerheis/stampede2/hpc-stack/src/hpc-stack/pkg/g2-v3.4.1/src/enc_jpeg2000.c:179: undefined reference to `jpc_encode'
/work/07738/kgerheis/stampede2/hpc-stack/src/hpc-stack/pkg/g2-v3.4.1/src/enc_jpeg2000.c:188: undefined reference to `jas_stream_close'
/work/07738/kgerheis/stampede2/hpc-stack/src/hpc-stack/pkg/g2-v3.4.1/src/enc_jpeg2000.c:189: undefined reference to `jas_stream_close'

aerorahul commented 4 years ago

@MinsukJi-NOAA Look at these differences from the branch hpc3 from @binli2337.

It doesn't address these g2 errors though.

Looking at the configure.fv3.<platform> files, I suspect more variables used in the S2S GNUMake may be missing. Can you check whether your NCEPLIBS variable is setting the JASPER_LIB and Z_LIB paths correctly? The modulefiles don't set the variables JASPER_LIB or Z_LIB. You will most likely need to construct them from JASPER_ROOT and ZLIB_ROOT. Other hidden or missing variables may include PNG_LIB.
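
A minimal sketch of constructing those variables from the `*_ROOT` values, assuming each root holds a static library under `lib64` or `lib`. The helper name `static_lib_from_root` is hypothetical, and whether the S2S makefiles pick these up from the environment depends on how they assign the variables:

```shell
#!/bin/sh
# static_lib_from_root ROOT NAME
# Prints the path of libNAME.a under ROOT/lib64 or ROOT/lib, whichever is
# found first. Returns non-zero (with a message on stderr) if neither exists.
static_lib_from_root() {
    for sub in lib64 lib; do
        if [ -f "$1/$sub/lib$2.a" ]; then
            printf '%s\n' "$1/$sub/lib$2.a"
            return 0
        fi
    done
    echo "lib$2.a not found under $1" >&2
    return 1
}

# Example use, commented out because it needs the hpc-stack modules loaded:
# JASPER_LIB=$(static_lib_from_root "$JASPER_ROOT" jasper)
# PNG_LIB=$(static_lib_from_root "$PNG_ROOT" png)
# Z_LIB=$(static_lib_from_root "$ZLIB_ROOT" z)
# export JASPER_LIB PNG_LIB Z_LIB
```

Checking both `lib64` and `lib` avoids hard-coding a layout that may differ between packages in the stack.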

MinsukJi-NOAA commented 4 years ago

@aerorahul @binli2337, modifying EXTLIBS = $(NCEPLIBS) $(ESMF_F90LINKPATHS) $(ESMF_LIB) $(LDFLAGS) $(NETCDF_LIB) in configure.fv3.* did not help.

MinsukJi-NOAA commented 4 years ago

@aerorahul, I do see an undefined JASPER_LIB being used in NCEPLIBS. I will try $JASPER_ROOT/lib64/libjasper.a. Regarding -lhdf5 and -lhdf5_hl, it's still not clear how that can be fixed.

This is how it's currently done. NCEPLIBS = $(POST_LIB) $(NEMSIO_LIB) $(G2_LIB4) $(G2TMPL_LIB) $(BACIO_LIB4) $(SP_LIBd) $(W3EMC_LIBd) $(W3NCO_LIBd) $(CRTM_LIB) $(JASPER_LIB) -lpng -lz

aerorahul commented 4 years ago

The -lpng and -lz flags need -L paths to PNG_ROOT/lib64 and ZLIB_ROOT/lib if the libraries are not in system locations.

What does the variable ESMF_F90LINKPATHS report?

MinsukJi-NOAA commented 4 years ago

I am trying to figure out where ESMF_F90LINKPATHS is set.

aerorahul commented 4 years ago

@MinsukJi-NOAA Look at one of the older modulefiles for ESMF. It is possible that modulefile is setting that variable. That information is in ESMFMKFILE. Several places "include" this file, but I am not sure how the variables are set.

aerorahul commented 4 years ago

These variables are set in the file that ESMFMKFILE points to (esmf.mk).
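
To see what a given esmf.mk actually reports for ESMF_F90LINKPATHS, the value can be read straight from the file. A small sketch, assuming esmf.mk holds plain `NAME=value` lines; the function name `esmf_mk_var` is made up for illustration:

```shell
#!/bin/sh
# esmf_mk_var VAR [FILE]
# Prints the value assigned to VAR in FILE. FILE defaults to $ESMFMKFILE,
# which the ESMF modulefile is said to define.
esmf_mk_var() {
    file=${2:-$ESMFMKFILE}
    # Strip "VAR =" (with optional spaces) and print what remains of the line.
    sed -n "s/^$1[[:space:]]*=[[:space:]]*//p" "$file"
}
```

With the ESMF module loaded, `esmf_mk_var ESMF_F90LINKPATHS` shows whether the NetCDF and HDF5 library directories are among the link paths ESMF hands to the build.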

aerorahul commented 3 years ago

@benjamin-cash Is this issue still relevant?

benjamin-cash commented 3 years ago

No, I have just borrowed the functioning setup Minsuk has on Stampede, so it isn't holding me up. I don't know if or how the problem of hpc-stack not working with S2S was resolved, but I can go ahead and close this specific issue.

MinsukJi-NOAA commented 3 years ago

This issue was solved by using the approach taken here: https://github.com/ufs-community/ufs-s2s-model/pull/191

Files related to porting to Stampede2 will be checked into the S2S repo: https://github.com/ufs-community/ufs-s2s-model/pull/213