palma-ice / yelmo

Yelmo ice-sheet model code base
GNU General Public License v3.0

yelmo_benchmark causes segfault (gfortran, ubuntu linux) #1

Closed: bueler closed this issue 4 years ago

bueler commented 4 years ago

I have the following config file:

FC = gfortran

INC_NC  = -I/usr/include
LIB_NC  = -L/usr/lib -lnetcdff -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now -lnetcdf -lnetcdf -ldl -lz -lcurl -lm

LISROOT = /home/bueler/rarepos/lisbuild/
INC_LIS = -I${LISROOT}/include 
LIB_LIS = -L${LISROOT}/lib/ -llis

FFLAGS  = -ffree-line-length-none -I$(objdir) -J$(objdir)
LFLAGS  = $(LIB_NC) $(LIB_LIS) -Wl,-zmuldefs

DFLAGS_NODEBUG = -O2
DFLAGS_DEBUG   = -w -g -p -ggdb -ffpe-trap=invalid,zero,overflow,underflow -fbacktrace -fcheck=all
DFLAGS_PROFILE = -O2 -pg

Then I successfully ran

make clean
make yelmo-static
make benchmarks

Then I did

python run_yelmo.py -r -e benchmarks output/test par/yelmo_EISMINT.nml

The executable yelmo_benchmarks.x ran for about two minutes and then segfaulted with an empty backtrace:

$ python run_yelmo.py -r -e benchmarks output/test par/yelmo_EISMINT.nml 
Directory already exists: output/test
Warning: path does not exist ice_data
Running job in background: cd output/test && exec ./yelmo_benchmarks.x yelmo_EISMINT.nml > out.out &
$ 
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

What further information can I send to help you diagnose this error? I do not know how to interpret the warning "path does not exist ice_data", but I have no reason to believe it causes the segfault. The file out.out is empty. The git revision is master at tag v0.972.

Ed

alex-robinson commented 4 years ago

Dear Ed,

I have reproduced your error. It seems to come down to the fact that the specific parameter file par/yelmo_EISMINT.nml was not made consistent with some of the latest changes. I will make sure to update it in the next revision. In the meantime, I recommend using only parameter files from the directory par/gmd, which have been validated for the current release. Yelmo is still under heavy development, so inconsistent parameter files like this one sometimes slip through.

Note that one way to get more info out of the backtrace is to compile with debugging flags on:

make clean
make benchmarks debug=1

In this case, the backtrace shows that the error involves the size of the array dt_save. Unfortunately this is not very informative either, but it helped me deduce that the adaptive timestep in use was very small, which may also have been an issue here. This error has already been resolved on a non-release branch, and the fix will be included in the next iteration.
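When chasing a crash like this, it can also help to run the executable in the foreground and keep stderr together with stdout. The launch command shown above redirects only stdout to out.out and backgrounds the job, while runtime error messages and backtraces normally go to stderr. The following is a minimal sketch of that idea, not part of Yelmo or run_yelmo.py; the helper name run_foreground and its arguments are hypothetical.

```python
import signal
import subprocess


def run_foreground(cmd, cwd=None, log_path=None):
    """Run cmd, wait for it to finish, and return (status, combined output).

    stderr is merged into stdout so that any error message or backtrace
    ends up in the same stream (and the same log file) as normal output.
    """
    proc = subprocess.run(
        cmd,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )

    # Optionally keep a copy of everything the program printed.
    if log_path is not None:
        with open(log_path, "w") as fh:
            fh.write(proc.stdout)

    # On POSIX systems a negative return code means the process was
    # terminated by a signal (for example SIGSEGV).
    if proc.returncode < 0:
        status = "killed by " + signal.Signals(-proc.returncode).name
    else:
        status = "exit code %d" % proc.returncode
    return status, proc.stdout


# Hypothetical usage, assuming the executable and namelist are already
# in output/test as in the log above:
#
#   status, output = run_foreground(
#       ["./yelmo_benchmarks.x", "yelmo_EISMINT.nml"],
#       cwd="output/test",
#       log_path="output/test/out_all.log",
#   )
#   print(status)
```

The log file name out_all.log is only an example. Combined with a debug=1 build, this keeps whatever backtrace is produced next to the regular model output instead of leaving out.out empty.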

Cheers, Alex