NOAA-EMC / NCEPLIBS-ip

Fortran 90 subprograms to be used for interpolating between nearly all grids used at NCEP.

SEGFAULT in bicubic_interp_mod.F90 in grib_util test on Intel compilers only #205

Open edwardhartnett opened 1 year ago

edwardhartnett commented 1 year ago

This is happening with Intel compilers only.

That said, the grib_util utilities are riddled with memory bugs, except degrib2, which I've cleaned up. The others have never been able to pass any kind of memory checking.

However, no package should segfault if it can check parameters and return an error instead, so this error should be reduced to a unit test in ip, and then we can decide how ip should handle it instead of segfaulting.

Meanwhile, I will take a look at the memory problems in copygb2 and see if I can fix some of them...

 *** Running copygb2 test
4: + ../src/copygb2/copygb2 -x data/ref_gdaswave.t00z.wcoast.0p16.f000.grib2 test_gdaswave_2.grib2
4: + ../src/copygb2/copygb2 -g '30 6 0 0 0 0 0 0 1473 1025 12190000 226541000 8 25000000 265000000 5079000 5079000 0 64 25000000 25000000' '-i1 1' -x data/ref_gdaswave.t00z.wcoast.0p16.f000.grib2 test_gdaswave_2.ip.grib2
4: forrtl: severe (174): SIGSEGV, segmentation fault occurred
4: Image              PC                Routine            Line        Source             
4: libc.so.6          00007F3E0E242520  Unknown               Unknown  Unknown
4: copygb2            000000000045EFF9  bicubic_interp_mo         127  bicubic_interp_mod.F90
4: copygb2            0000000000453AFB  ipolates_grib1_si          84  ipolates.F90
4: copygb2            000000000040B42F  Unknown               Unknown  Unknown
4: copygb2            000000000040BD26  Unknown               Unknown  Unknown
4: copygb2            000000000040E872  Unknown               Unknown  Unknown
4: copygb2            0000000000417179  Unknown               Unknown  Unknown
4: copygb2            000000000040886B  Unknown               Unknown  Unknown
4: copygb2            000000000040750D  Unknown               Unknown  Unknown
4: libc.so.6          00007F3E0E229D90  Unknown               Unknown  Unknown
4: libc.so.6          00007F3E0E229E40  __libc_start_main     Unknown  Unknown
4: copygb2            0000000000407425  Unknown               Unknown  Unknown
4/7 Test #4: run_copygb2_tests.sh .............***Failed    0.28 sec
AlexanderRichert-NOAA commented 1 year ago

I believe this is related to an issue I'd been seeing with ip unit tests segfaulting on my personal computer. Based on what I found here, I tried increasing the stack limit (ulimit -s unlimited) and that fixed it. I'm guessing this is why Kyle put the ulimit -s unlimited line in ip's Intel CI. The same thing happens if I try the other interpolation schemes, i.e., they segfault unless I remove the stack limit. I'm still working on figuring out how much of it is a code issue per se vs. a compiler issue.
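
For anyone who wants to try the stack-limit workaround without changing their whole login shell, here is a minimal sketch. The helper name run_with_big_stack is made up for illustration and is not part of ip or grib_util. It raises the soft stack limit only as far as the hard limit allows, which is what ulimit -s unlimited amounts to on systems where the hard limit is unlimited. Whether this also clears the copygb2 failure above has not been confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of ip or grib_util): run one command with the
# soft stack limit raised to the hard limit, without touching the caller's shell.
run_with_big_stack() {
    (
        # The subshell keeps the changed limit from leaking into the caller.
        hard=$(ulimit -H -s)
        # Raising the soft limit up to the hard limit needs no extra privileges.
        ulimit -S -s "$hard" 2>/dev/null ||
            echo "warning: could not raise stack limit" >&2
        "$@"
    )
}

# Show the limit before and inside the wrapper.
echo "soft stack limit in this shell: $(ulimit -S -s)"
run_with_big_stack sh -c 'echo "soft stack limit inside wrapper: $(ulimit -S -s)"'
```

To apply it to the failing case, put run_with_big_stack in front of the copygb2 command line from the log above. If the segfault goes away, that points at large automatic arrays landing on the stack rather than at an out-of-bounds access, though it would not rule out a code problem.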