husseinaluie / FlowSieve

FlowSieve coarse-graining code base
https://flowsieve.readthedocs.io/en/latest/

Successful compile but 'Assertion `input_nc_format == (3)' failed.' on JASMIN HPC #37

Open thomaswilder opened 6 months ago

thomaswilder commented 6 months ago

Hello,

This query is probably related to issues #20 and #22. As far as I can tell, nobody has reported getting FlowSieve working on JASMIN?

I am testing the BASIC Tutorial but after running

mpirun ./coarse_grain.x --input_file velocity_sample.nc --filter_scales "1e3 15e3 50e3 100e3"

in an sbatch script, I get the following error:

coarse_grain.x: NETCDF_IO/read_var_from_file.cpp:85: void read_var_from_file(std::vector<double>&, const string&, const string&, std::vector<bool>*, std::vector<int>*, std::vector<int>*, int, int, bool, int, double, MPI_Comm): Assertion `input_nc_format == (3)' failed.
[host424:164200] *** Process received signal ***
[host424:164200] Signal: Aborted (6)
[host424:164200] Signal code:  (-6)
[host424:164200] [ 0] /lib64/libpthread.so.0(+0xf630)[0x7f0174ef3630]
[host424:164200] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x7f0174b4c387]
[host424:164200] [ 2] /lib64/libc.so.6(abort+0x148)[0x7f0174b4da78]
[host424:164200] [ 3] /lib64/libc.so.6(+0x2f1a6)[0x7f0174b451a6]
[host424:164200] [ 4] /lib64/libc.so.6(+0x2f252)[0x7f0174b45252]
[host424:164200] [ 5] ./coarse_grain.x[0x4aa8d3]
[host424:164200] [ 6] ./coarse_grain.x[0x4b3e1b]
[host424:164200] [ 7] ./coarse_grain.x[0x4a0828]
[host424:164200] [ 8] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f0174b38555]
[host424:164200] [ 9] ./coarse_grain.x[0x4a397e]
[host424:164200] *** End of error message ***
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 164200 on node host424 exited on signal 6 (Aborted).
--------------------------------------------------------------------------
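
For context: in the netCDF C library's netcdf.h, the value 3 is NC_FORMAT_NETCDF4, so the assertion appears to require an input file in netCDF-4 (HDF5-based) format. That input_nc_format is filled in by nc_inq_format is an assumption, since read_var_from_file.cpp is not shown here. Where the netCDF utilities are available, `ncdump -k velocity_sample.nc` reports the file's format directly. As a dependency-free sketch, the helper below (nc_family is a hypothetical name, not part of FlowSieve or netCDF) guesses the format family from the first four bytes of a file. It assumes the HDF5 signature sits at offset 0, which is the usual case, and it cannot distinguish "netCDF-4" from "netCDF-4 classic model", because both are HDF5 files.

```shell
#!/bin/sh
# Guess the on-disk family of a netCDF file from its leading magic bytes.
# Rough stand-in for `ncdump -k`; nc_family is a hypothetical helper name.
nc_family() {
    # Hex-dump the first four bytes, e.g. "43444601" for "CDF\001".
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        43444601) echo "classic" ;;                 # "CDF" + 0x01
        43444602) echo "64-bit offset" ;;           # "CDF" + 0x02
        43444605) echo "cdf5" ;;                    # "CDF" + 0x05
        89484446) echo "hdf5 (netCDF-4 family)" ;;  # 0x89 "HDF"
        *)        echo "unknown" ;;
    esac
}
```

Calling `nc_family velocity_sample.nc` on the tutorial input would then show whether the file itself is in the HDF5-based family, independently of how the library was built.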

What I have gathered from the previous issues is that --has-parallel should be yes. Running nc-config --all gives

nc-config --all

This netCDF 4.8.1 has been built with the following features: 

  --cc            -> x86_64-conda-linux-gnu-cc
  --cflags        -> -I/apps/jasmin/jaspy/mambaforge_envs/jaspy3.10/mf-22.11.1-4/envs/jaspy3.10-mf-22.11.1-4-r20230718/include
  --libs          -> -L/apps/jasmin/jaspy/mambaforge_envs/jaspy3.10/mf-22.11.1-4/envs/jaspy3.10-mf-22.11.1-4-r20230718/lib -lnetcdf
  --static        -> -lmfhdf -ldf -lhdf5_hl -lhdf5 -lm -lcurl -lzip

  --has-c++       -> no
  --cxx           -> 

  --has-c++4      -> yes
  --cxx4          -> /home/conda/feedstock_root/build_artifacts/netcdf-cxx4_1659035179945/_build_env/bin/x86_64-conda-linux-gnu-c++
  --cxx4flags     -> -I/apps/jasmin/jaspy/mambaforge_envs/jaspy3.10/mf-22.11.1-4/envs/jaspy3.10-mf-22.11.1-4-r20230718/include
  --cxx4libs      -> -L/apps/jasmin/jaspy/mambaforge_envs/jaspy3.10/mf-22.11.1-4/envs/jaspy3.10-mf-22.11.1-4-r20230718//apps/jasmin/jaspy/mambaforge_envs/jaspy3.10/mf-22.11.1-4/envs/jaspy3.10-mf-22.11.1-4-r20230718/lib -lnetcdf-cxx4 -lnetcdf

  --has-fortran   -> yes
  --fc            -> /home/conda/feedstock_root/build_artifacts/netcdf-fortran_1674656969142/_build_env/bin/x86_64-conda-linux-gnu-gfortran
  --fflags        -> -I/apps/jasmin/jaspy/mambaforge_envs/jaspy3.10/mf-22.11.1-4/envs/jaspy3.10-mf-22.11.1-4-r20230718/include -I/apps/jasmin/jaspy/mambaforge_envs/jaspy3.10/mf-22.11.1-4/envs/jaspy3.10-mf-22.11.1-4-r20230718/include
  --flibs         -> -L/apps/jasmin/jaspy/mambaforge_envs/jaspy3.10/mf-22.11.1-4/envs/jaspy3.10-mf-22.11.1-4-r20230718/lib -lnetcdff -lnetcdf -lnetcdf
  --has-f90       -> TRUE
  --has-f03       -> yes

  --has-dap       -> yes
  --has-dap2      -> yes
  --has-dap4      -> yes
  --has-nc2       -> yes
  --has-nc4       -> yes
  --has-hdf5      -> yes
  --has-hdf4      -> yes
  --has-logging   -> no
  --has-pnetcdf   -> no
  --has-szlib     -> no
  --has-cdf5      -> yes
  --has-parallel4 -> no
  --has-parallel  -> no
  --has-nczarr    -> yes

  --prefix        -> /apps/jasmin/jaspy/mambaforge_envs/jaspy3.10/mf-22.11.1-4/envs/jaspy3.10-mf-22.11.1-4-r20230718
  --includedir    -> /apps/jasmin/jaspy/mambaforge_envs/jaspy3.10/mf-22.11.1-4/envs/jaspy3.10-mf-22.11.1-4-r20230718/include
  --libdir        -> /apps/jasmin/jaspy/mambaforge_envs/jaspy3.10/mf-22.11.1-4/envs/jaspy3.10-mf-22.11.1-4-r20230718/lib
  --version       -> netCDF 4.8.1

which shows that this netCDF build does not have parallel support (--has-parallel -> no). Could it be that JASMIN has no pre-compiled parallel version of netCDF installed, and that this causes the error? I would like to rule out other issues first, e.g. with the jasmin.mk file:

# Specify compilers
CXX    ?= g++
MPICXX ?= mpicxx

# Linking flags for netcdf
LINKS:= -lnetcdf-cxx4 -lnetcdf -lhdf5_hl -lhdf5 -lm -ldl -lz -fopenmp

# Default compiler flags
CFLAGS:=-Wall -std=c++14

# Debug flags
DEBUG_FLAGS:=-g
DEBUG_LDFLAGS:=-g

# Basic optimization flags
OPT_FLAGS:=-O3

# Extra optimization flags (intel inter-process optimizations)
EXTRA_OPT_FLAGS:=

# Specify optimization flags for ALGLIB
ALGLIB_OPT_FLAGS:=-O3

# Modules are automatically on lib dir
NETCDF_LIBS="-L/apps/sw/eb/software/netCDF/4.8.0-gompi-2021a/lib"
NETCDF_INCS="-I/apps/sw/eb/software/netCDF/4.8.0-gompi-2021a/include"

HDF5_LIBS="-L/apps/sw/eb/software/HDF5/1.12.1-gompi-2021b/lib"
HDF5_INCS="-I/apps/sw/eb/software/HDF5/1.12.1-gompi-2021b/include"

LIB_DIRS:=${NETCDF_LIBS} ${HDF5_LIBS}
INC_DIRS:=${NETCDF_INCS} ${HDF5_INCS}
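
One thing that may be worth ruling out: the nc-config output above describes netCDF 4.8.1 under the jaspy prefix, while jasmin.mk points at a netCDF 4.8.0 tree under /apps/sw/eb, so the two may not refer to the same library. The sketch below shows one way to compare them. The helper names (nc_flag, mk_paths, mk_check_prefix) are hypothetical, and the parsing assumes that nc-config prints `--flag -> value` lines and that the makefile writes its -L/-I paths literally, as in the listings above.

```shell
#!/bin/sh
# Read `nc-config --all` style text on stdin and print the value of one flag.
# Example: nc-config --all | nc_flag has-parallel
nc_flag() {
    awk -v flag="--$1" '$1 == flag && $2 == "->" { print $3 }'
}

# List the directories passed via -L or -I in a makefile fragment.
mk_paths() {
    grep -o -- '-[LI]/[^" ]*' "$1" | sed 's/^-[LI]//'
}

# Mark each makefile path as lying under PREFIX ($2) or not.
mk_check_prefix() {
    mk_paths "$1" | while IFS= read -r dir; do
        case "$dir" in
            "$2"/*) echo "match    $dir" ;;
            *)      echo "MISMATCH $dir" ;;
        esac
    done
}
```

A possible use is `mk_check_prefix jasmin.mk "$(nc-config --prefix)"`, followed by `nc-config --all | nc_flag has-parallel` for whichever nc-config belongs to the library that is actually linked. A MISMATCH line only means the makefile and nc-config describe different installations. It does not by itself say which one is wrong.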

Any advice appreciated.