husseinaluie / FlowSieve

FlowSieve coarse-graining code base
https://flowsieve.readthedocs.io/en/latest/

Cannot find -lhdf5 #22

Closed SalahKouhen closed 1 year ago

SalahKouhen commented 1 year ago

Hi Ben,

I'm still trying 😄. I received some help from the university IT team and now parallel is enabled (https://github.com/husseinaluie/FlowSieve/issues/20). I have included the output of nc-config --all at the bottom in case it is helpful.

Unfortunately, I now run into this error:

...
icpc: command line warning #10148: option '-Wdate-time' not supported
ld: cannot find -lhdf5_hl
ld: cannot find -lhdf5
Makefile:208: recipe for target 'Case_Files/coarse_grain.x' failed
make: *** [Case_Files/coarse_grain.x] Error 1
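For anyone debugging the same "cannot find -lhdf5" message: the linker reports it when none of the directories it searches holds a matching libhdf5.so or libhdf5.a. Below is a minimal Python sketch of that check. The function name unresolved_libs and its extra_dirs parameter are made up for illustration, and the check only approximates the real linker search (it ignores versioned file names, LIBRARY_PATH, and the linker's built-in default directories unless they are passed in).

```python
import os
import shlex


def unresolved_libs(link_flags, extra_dirs=()):
    """Return the -l names that have no lib<name>.so or lib<name>.a
    in any -L directory from link_flags (plus extra_dirs).

    Hypothetical helper: a rough stand-in for the linker's search,
    not a replacement for it.
    """
    tokens = shlex.split(link_flags)
    # -L (upper case) adds a search directory, -l (lower case) names a library.
    dirs = [t[2:] for t in tokens if t.startswith("-L")] + list(extra_dirs)
    names = [t[2:] for t in tokens if t.startswith("-l")]

    missing = []
    for name in names:
        candidates = (f"lib{name}.so", f"lib{name}.a")
        found = any(
            os.path.exists(os.path.join(d, c)) for d in dirs for c in candidates
        )
        if not found and name not in missing:
            missing.append(name)
    return missing


# With no -L directories at all, every library is reported as unresolved.
print(unresolved_libs("-lhdf5_hl -lhdf5"))
```

Running it on the full link line printed by make (with the module directories from nc-config --libs added) should show whether the HDF5 directory is actually being passed to the linker.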

I did:

make clean

module load intel-compilers/2022
module load openmpi/4.1.4-intel
module load hdf5/1.12.2-intel-parallel
module load netcdf/netcdf-c-4.9.0-parallel

make Case_Files/coarse_grain.x

Any help would be deeply appreciated!

All the best, Salah

This netCDF 4.9.0 has been built with the following features:

  --cc            -> mpicc
  --cflags        -> -I/network/software/ubuntu_bionic/netcdf/netcdf-c-4.9.0-parallel/include -I/network/software/ubuntu_bionic/hdf5/1.12.2-intel-parallel/include
  --libs          -> -L/network/software/ubuntu_bionic/netcdf/netcdf-c-4.9.0-parallel/lib -L/network/software/ubuntu_bionic/hdf5/1.12.2-intel-parallel/lib -lnetcdf -lhdf5_hl -lhdf5 -lm -lz -lsz -lbz2 -lxml2 -lcurl
  --static        -> -lhdf5_hl -lhdf5 -lm -lz -lsz -lbz2 -lxml2 -lcurl

  --has-c++       -> no
  --cxx           ->

  --has-c++4      -> yes
  --cxx4          -> g++
  --cxx4flags     -> -I/usr/include -Wdate-time -D_FORTIFY_SOURCE=2
  --cxx4libs      -> -L/usr/lib/x86_64-linux-gnu -lnetcdf_c++4 -lnetcdf

  --has-fortran   -> yes
  --fc            -> gfortran
  --fflags        -> -I/usr/include
  --flibs         -> -L/usr/lib -lnetcdff -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now -lnetcdf -lnetcdf
  --has-f90       -> no
  --has-f03       -> yes

  --has-dap       -> yes
  --has-dap2      -> yes
  --has-dap4      -> yes
  --has-nc2       -> yes
  --has-nc4       -> yes
  --has-hdf5      -> yes
  --has-hdf4      -> no
  --has-logging   -> no
  --has-pnetcdf   -> no
  --has-szlib     -> yes
  --has-cdf5      -> yes
  --has-parallel4 -> yes
  --has-parallel  -> yes
  --has-nczarr    -> yes
  --has-zstd      -> no
  --has-benchmarks -> no

  --prefix        -> /network/software/ubuntu_bionic/netcdf/netcdf-c-4.9.0-parallel
  --includedir    -> /network/software/ubuntu_bionic/netcdf/netcdf-c-4.9.0-parallel/include
  --libdir        -> /network/software/ubuntu_bionic/netcdf/netcdf-c-4.9.0-parallel/lib
  --version       -> netCDF 4.9.0

bastorer commented 1 year ago

Hi Salah,

Okay! With parallel netcdf enabled, we should be able to get this working!

Could you attach the system.mk file that you're using too?

Thanks!

bastorer commented 1 year ago

My suspicion is that we'll need to add -I/network/software/ubuntu_bionic/netcdf/netcdf-c-4.9.0-parallel/include -I/network/software/ubuntu_bionic/hdf5/1.12.2-intel-parallel/include to INC_DIRS and -L/network/software/ubuntu_bionic/netcdf/netcdf-c-4.9.0-parallel/lib -L/network/software/ubuntu_bionic/hdf5/1.12.2-intel-parallel/lib to LIB_DIRS, as well as listing all of -lhdf5_hl -lhdf5 -lm -lz -lsz -lbz2 -lxml2 -lcurl in LINKS.
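The suggestion above amounts to sorting the nc-config output into three buckets: -I flags, -L flags, and -l flags. A small Python sketch of that sorting is shown here, using the --cflags and --libs strings copied from the nc-config --all output in the first post. The helper name split_flags is hypothetical, and the printed lines only mimic the assignment style of system.mk; they are not generated by FlowSieve itself.

```python
import shlex


def split_flags(flag_string):
    """Group flags into include dirs (-I), library dirs (-L), libraries (-l)
    and everything else. Hypothetical helper for illustration."""
    groups = {"INC_DIRS": [], "LIB_DIRS": [], "LINKS": [], "OTHER": []}
    for tok in shlex.split(flag_string):
        if tok.startswith("-I"):
            groups["INC_DIRS"].append(tok)
        elif tok.startswith("-L"):
            groups["LIB_DIRS"].append(tok)
        elif tok.startswith("-l"):
            groups["LINKS"].append(tok)
        else:
            groups["OTHER"].append(tok)
    return groups


# Strings copied from the nc-config --all output posted in this thread.
cflags = (
    "-I/network/software/ubuntu_bionic/netcdf/netcdf-c-4.9.0-parallel/include "
    "-I/network/software/ubuntu_bionic/hdf5/1.12.2-intel-parallel/include"
)
libs = (
    "-L/network/software/ubuntu_bionic/netcdf/netcdf-c-4.9.0-parallel/lib "
    "-L/network/software/ubuntu_bionic/hdf5/1.12.2-intel-parallel/lib "
    "-lnetcdf -lhdf5_hl -lhdf5 -lm -lz -lsz -lbz2 -lxml2 -lcurl"
)

merged = split_flags(cflags + " " + libs)
for name in ("INC_DIRS", "LIB_DIRS", "LINKS"):
    # Print in the same VAR:=value style that system.mk uses.
    print(f"{name}:={' '.join(merged[name])}")
```

Keeping the flags in the order nc-config reports them preserves the library order (-lnetcdf before -lhdf5_hl before -lhdf5), which matters for some linkers.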

SalahKouhen commented 1 year ago

Hi Ben,

Thanks for the advice! I feel it is almost there.

This is the system.mk after I tried your suggestions (I probably did something wrong as I get issues running the basic tutorial):

# The following modules were last used
#  openmpi/2.0.1/b1  
#  hdf5/1.8.19/b1     
#  netcdf/4.3.3.1
#  gcc/8.2.0/b1
#  fftw3/3.3.6/b1

# Specify compilers
CXX     ?= g++
MPICXX  ?= mpicxx

# Linking flags for netcdf
LINKS:=-lnetcdf -lhdf5_hl -lhdf5 -lz -lcurl -fopenmp -lm -lsz -lbz2 -lxml2

# Default compiler flags
CFLAGS:=-Wall -std=c++14

# Debug flags
DEBUG_FLAGS:=-g
DEBUG_LDFLAGS:=-g

# Basic optimization flags
OPT_FLAGS:=-O3

# Extra optimization flags (intel inter-process optimizations)
EXTRA_OPT_FLAGS:=

# Specify optimization flags for ALGLIB
ALGLIB_OPT_FLAGS:=-O3

# Modules are automatically on lib dir
NETCDF_LIBS=`nc-config --cxx4libs` -L/network/software/ubuntu_bionic/netcdf/netcdf-c-4.9.0-parallel/lib -L/network/software/ubuntu_bionic/hdf5/1.12.2-intel-parallel/lib
NETCDF_INCS=`nc-config --cxx4flags` -I/network/software/ubuntu_bionic/netcdf/netcdf-c-4.9.0-parallel/include -I/network/software/ubuntu_bionic/hdf5/1.12.2-intel-parallel/include

LIB_DIRS:=${NETCDF_LIBS}
INC_DIRS:=${NETCDF_INCS}

When I compiled using this the output was: compileOut.txt

The issues were:

icpc: command line warning #10148: option '-Wdate-time' not supported
ld: warning: libmpi.so.40, needed by /network/software/ubuntu_bionic/hdf5/1.12.2-intel-parallel/lib/libhdf5_hl.so, may conflict with libmpi.so.20
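The second warning says that two different libmpi versions are being pulled into the same link, which may mean that the MPI compiler wrapper and the HDF5 module come from different MPI installations (an assumption; the thread does not confirm it). In a long build log such lines are easy to miss, so here is a hedged Python sketch that extracts them. The regular expression is based only on the wording of the warning shown above, and the name find_conflicts is made up for illustration.

```python
import re

# Pattern modelled on the single ld warning quoted in this thread;
# other linker versions may word the message differently.
CONFLICT = re.compile(
    r"ld: warning: (?P<wanted>\S+), needed by (?P<needed_by>\S+), "
    r"may conflict with (?P<other>\S+)"
)


def find_conflicts(build_log):
    """Return one dict per 'may conflict with' warning in build_log."""
    return [m.groupdict() for m in CONFLICT.finditer(build_log)]


# The two lines reported above, copied verbatim.
log = (
    "icpc: command line warning #10148: option '-Wdate-time' not supported\n"
    "ld: warning: libmpi.so.40, needed by "
    "/network/software/ubuntu_bionic/hdf5/1.12.2-intel-parallel/lib/libhdf5_hl.so, "
    "may conflict with libmpi.so.20\n"
)

for hit in find_conflicts(log):
    print(f"{hit['needed_by']} wants {hit['wanted']}, "
          f"but {hit['other']} is also linked")
```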

I then generated the data in the basic tutorial and compiled with the recommended constants. When I ran ./coarse_grain.x --input_file velocity_sample.nc --filter_scales "1e3 15e3 50e3 100e3" I got:

 Commandline flag "--input_file" got value "velocity_sample.nc"
 Commandline flag "--time" received no value - will use default "time"
 Commandline flag "--depth" received no value - will use default "depth"
 Commandline flag "--latitude" received no value - will use default "latitude"
 Commandline flag "--longitude" received no value - will use default "longitude"
 Commandline flag "--is_degrees" received no value - will use default "true"
 Commandline flag "--Nprocs_in_time" received no value - will use default "1"
 Commandline flag "--Nprocs_in_depth" received no value - will use default "1"
 Commandline flag "--zonal_vel" received no value - will use default "uo"
 Commandline flag "--merid_vel" received no value - will use default "vo"
 Commandline flag "--region_definitions_file" received no value - will use default "region_definitions.nc"
 Commandline flag "--region_definitions_dim" received no value - will use default "region"
 Commandline flag "--region_definitions_var" received no value - will use default "region_definition"
 Commandline flag "--filter_scales" got value "1e3 15e3 50e3 100e3"
Filter scales (4) are:  1km,  15km,  50km,  100km,

Compiled at 09:25:20 on Jan 19 2023.
  Version 3.1.1

Using Cartesian coordinates.
coarse_grain.x: NETCDF_IO/read_var_from_file.cpp:85: void read_var_from_file(std::vector<double, std::allocator<double>> &, const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> &, const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> &, std::vector<bool, std::allocator<bool>> *, std::vector<int, std::allocator<int>> *, std::vector<int, std::allocator<int>> *, int, int, bool, int, double, ompi_communicator_t *): Assertion `input_nc_format == NC_FORMAT_NETCDF4' failed.
[atmlxint2:39184] *** Process received signal ***
[atmlxint2:39184] Signal: Aborted (6)
[atmlxint2:39184] Signal code:  (-6)
[atmlxint2:39184] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x12980)[0x7f4638f9d980]
[atmlxint2:39184] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7f4638bd8e87]
[atmlxint2:39184] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7f4638bda7f1]
[atmlxint2:39184] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x303fa)[0x7f4638bca3fa]
[atmlxint2:39184] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x30472)[0x7f4638bca472]
[atmlxint2:39184] [ 5] ./coarse_grain.x[0x421e9f]
[atmlxint2:39184] [ 6] ./coarse_grain.x[0x46266b]
[atmlxint2:39184] [ 7] ./coarse_grain.x[0x4b679a]
[atmlxint2:39184] [ 8] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f4638bbbc87]
[atmlxint2:39184] [ 9] ./coarse_grain.x[0x4051aa]
[atmlxint2:39184] *** End of error message ***
Aborted (core dumped)

So something is still not right! Once again, thanks for your help so far.

All the best, Salah

bastorer commented 1 year ago

It compiles! Progress!

This error (Assertion `input_nc_format == NC_FORMAT_NETCDF4' failed.) is surprising for the tutorial. It sometimes comes up with model data that is written in an older netCDF format. In the tutorial, though, we create the input file ourselves and explicitly tell it to use netCDF-4.

Can you call ncdump -k velocity_sample.nc? It should return netCDF-4.

Also, while we're at it, in python, could you call import netCDF4; netCDF4.__version__?
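If ncdump is not available, the file's on-disk format can be roughly identified from its first bytes: classic-family netCDF files start with the characters CDF followed by a version byte, while netCDF-4 files are HDF5 files and start with the HDF5 signature. A minimal Python sketch follows. The function name sniff_netcdf_kind and the returned labels are made up for illustration (they are not the strings ncdump -k prints), and the check cannot tell netCDF-4 apart from the netCDF-4 classic model or from a non-netCDF HDF5 file.

```python
# First 8 bytes of every HDF5 file that has no user block.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"

# Version byte that follows the "CDF" magic in classic-family files.
CDF_VERSIONS = {1: "classic", 2: "64-bit offset", 5: "CDF-5"}


def sniff_netcdf_kind(path):
    """Guess the storage format of a netCDF file from its leading bytes.

    Only offset 0 is inspected, so an HDF5 file with a user block
    would be reported as "unknown".
    """
    with open(path, "rb") as f:
        head = f.read(8)
    if head == HDF5_SIGNATURE:
        return "HDF5-based (netCDF-4 family)"
    if head[:3] == b"CDF" and len(head) > 3:
        # Indexing bytes yields an int, which is the version number.
        return CDF_VERSIONS.get(head[3], "unknown CDF variant")
    return "unknown"
```

For the tutorial file, calling sniff_netcdf_kind("velocity_sample.nc") should agree with what ncdump -k reports about the format family.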

SalahKouhen commented 1 year ago

ncdump -k velocity_sample.nc returned netCDF-4

The netcdf version in my python environment is '1.6.0'.

I conda updated netcdf4 but nothing changed.

Salah

bastorer commented 1 year ago

Hmm, curious.

Since you're using conda, there's an environment.yml file in the main Tutorial directory (it's a recent addition, so you might need to git pull). Could you try running the basic tutorial using a conda environment built from that?

SalahKouhen commented 1 year ago

Hi Ben,

Still no luck:

conda env create -f environment.yml
python generate_data.py
./coarse_grain.x --input_file velocity_sample.nc --filter_scales "1e3 15e3 50e3 100e3"

Assertion input_nc_format == NC_FORMAT_NETCDF4 failed.

bastorer commented 1 year ago

Hi Salah,

I just sent an email to your physics.ox.ac.uk address with the velocity_sample.nc data file that I get when I run the tutorial. Out of paranoia / to try and narrow down where things are going awry, can you try running the tutorial with that file?

bastorer commented 1 year ago

We now have a working version of FlowSieve on Jasmin! Closing the issue :-)

In case anyone digs through this in the future trying to solve a similar problem, the solution was to remove the nc-config --cxx4libs and nc-config --cxx4flags parts from the NETCDF_LIBS and NETCDF_INCS variables.

I suspect the issue was related to the -lnetcdf link flag that was getting passed in too early in the compile list as a result.
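One way to explore that suspicion: in the nc-config output from the first post, --cxx4libs expands to -L/usr/lib/x86_64-linux-gnu -lnetcdf_c++4 -lnetcdf, so with the nc-config calls in place a system directory was listed ahead of the module directories. GNU ld searches -L directories in the order given, and every -L applies to every -l, so a libnetcdf in the system directory would be picked first. Whether that is what happened here is an assumption, not something the thread confirms. The Python sketch below only models the search order; the name search_order, the shortened /module/... paths, and the directory contents are all hypothetical.

```python
import shlex


def search_order(link_flags, dir_contents):
    """For each -l name, list the -L directories that provide it,
    in command-line order. The first entry is the one a linker that
    searches directories in order would use.

    dir_contents maps directory -> set of library names found there
    (supplied by the caller; nothing is read from disk).
    """
    tokens = shlex.split(link_flags)
    dirs = [t[2:] for t in tokens if t.startswith("-L")]
    result = {}
    for name in (t[2:] for t in tokens if t.startswith("-l")):
        result[name] = [d for d in dirs if name in dir_contents.get(d, ())]
    return result


# Hypothetical layout: both the system directory and the module directory
# provide libnetcdf. Paths are shortened stand-ins, not the real ones.
contents = {
    "/usr/lib/x86_64-linux-gnu": {"netcdf", "netcdf_c++4"},
    "/module/netcdf/lib": {"netcdf"},
    "/module/hdf5/lib": {"hdf5", "hdf5_hl"},
}
flags = (
    "-L/usr/lib/x86_64-linux-gnu -lnetcdf_c++4 -lnetcdf "
    "-L/module/netcdf/lib -L/module/hdf5/lib -lhdf5_hl -lhdf5"
)

for lib, providers in search_order(flags, contents).items():
    print(lib, "->", providers)
```

A library with more than one provider in the output is a candidate for this kind of shadowing; dropping the earlier -L entry, as was done here, leaves only the module copy.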