HDFGroup / hdf5

Official HDF5® Library Repository
https://www.hdfgroup.org/

Compiling user code with the Intel Fortran compiler and "-init=snan" triggers a segmentation fault #3977

Closed pums974 closed 3 weeks ago

pums974 commented 9 months ago

Compiling user Fortran code with ifort or ifx using the debug option "-init=snan" triggers a segmentation fault at runtime.

Even a very simple program triggers it:

    PROGRAM HELLO_HDF

        USE HDF5, only: h5open_f, h5close_f, h5get_libversion_f, h5eprint_f

        IMPLICIT NONE

        INTEGER :: error
        INTEGER :: majnum, minnum, relnum

        CALL h5open_f(error)
        IF (error /= 0) then
            call h5eprint_f(error)
            ERROR STOP 1
        end if

        CALL h5get_libversion_f(majnum, minnum, relnum, error)
        IF (error /= 0) then
            call h5eprint_f(error)
            ERROR STOP 1
        end if

        CALL h5close_f(error)
        IF (error /= 0) then
            call h5eprint_f(error)
            ERROR STOP 1
        end if

        WRITE (*, '(" HELLO_HDF is linked with HDF5 Library version ")', advance="NO")
        WRITE (*, '(I0)', advance="NO") majnum
        WRITE (*, '(".")', advance="NO")
        WRITE (*, '(I0)', advance="NO") minnum
        WRITE (*, '(" release ")', advance="NO")
        WRITE (*, '(I0)') relnum

    END PROGRAM HELLO_HDF
$ ifx --version
ifx (IFX) 2023.1.0 20230320
Copyright (C) 1985-2023 Intel Corporation. All rights reserved.

$ ifx -I/opt/local/hdf5/HDF5-1.14.3-Linux/HDF_Group/HDF5/1.14.3/mod/shared  -L/opt/local/hdf5/HDF5-1.14.3-Linux/HDF_Group/HDF5/1.14.3/lib/ -lhdf5_fortran 'hello_hdf.f90'

$ LD_LIBRARY_PATH=/opt/local/hdf5/HDF5-1.14.3-Linux/HDF_Group/HDF5/1.14.3/lib/:$LD_LIBRARY_PATH ./a.out 
 HELLO_HDF is linked with HDF5 Library version 1.14 release 3

$ ifx -init=snan -I/opt/local/hdf5/HDF5-1.14.3-Linux/HDF_Group/HDF5/1.14.3/mod/shared  -L/opt/local/hdf5/HDF5-1.14.3-Linux/HDF_Group/HDF5/1.14.3/lib/ -lhdf5_fortran 'hello_hdf.f90'

$ LD_LIBRARY_PATH=/opt/local/hdf5/HDF5-1.14.3-Linux/HDF_Group/HDF5/1.14.3/lib/:$LD_LIBRARY_PATH ./a.out 
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source             
libc.so.6          00007F9D09891520  Unknown               Unknown  Unknown
a.out              000000000040CBE6  Unknown               Unknown  Unknown
a.out              0000000000409A4A  for__signal_handl     Unknown  Unknown
libc.so.6          00007F9D09891520  Unknown               Unknown  Unknown
libhdf5.so.310.3.  00007F9D096E9B89  H5T__init_native_     Unknown  Unknown
libhdf5.so.310.3.  00007F9D096201A6  H5T_init              Unknown  Unknown
libhdf5.so.310.3.  00007F9D097077E9  H5VL_init_phase2      Unknown  Unknown
libhdf5.so.310.3.  00007F9D0940526F  H5_init_library       Unknown  Unknown
libhdf5.so.310.3.  00007F9D0940641D  H5open                Unknown  Unknown
libhdf5_f90cstub.  00007F9D0981E5CF  h5init_types_c        Unknown  Unknown
libhdf5_fortran.s  00007F9D09F8607D  h5lib_mp_h5open_f     Unknown  Unknown
a.out              000000000040A242  Unknown               Unknown  Unknown
a.out              000000000040A1FD  Unknown               Unknown  Unknown
libc.so.6          00007F9D09878D90  Unknown               Unknown  Unknown
libc.so.6          00007F9D09878E40  __libc_start_main     Unknown  Unknown
a.out              000000000040A115  Unknown               Unknown  Unknown

$ ifort --version
ifort (IFORT) 2021.9.0 20230302
Copyright (C) 1985-2023 Intel Corporation.  All rights reserved.

$ ifort  -I/opt/local/hdf5/HDF5-1.14.3-Linux/HDF_Group/HDF5/1.14.3/mod/shared  -L/opt/local/hdf5/HDF5-1.14.3-Linux/HDF_Group/HDF5/1.14.3/lib/ -lhdf5_fortran 'hello_hdf.f90'

$ LD_LIBRARY_PATH=/opt/local/hdf5/HDF5-1.14.3-Linux/HDF_Group/HDF5/1.14.3/lib/:$LD_LIBRARY_PATH ./a.out 
 HELLO_HDF is linked with HDF5 Library version 1.14 release 3

$ ifort -init=snan -I/opt/local/hdf5/HDF5-1.14.3-Linux/HDF_Group/HDF5/1.14.3/mod/shared  -L/opt/local/hdf5/HDF5-1.14.3-Linux/HDF_Group/HDF5/1.14.3/lib/ -lhdf5_fortran 'hello_hdf.f90'

$ LD_LIBRARY_PATH=/opt/local/hdf5/HDF5-1.14.3-Linux/HDF_Group/HDF5/1.14.3/lib/:$LD_LIBRARY_PATH ./a.out 
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source             
libc.so.6          00007F7AC364C520  Unknown               Unknown  Unknown
a.out              000000000040CBB6  Unknown               Unknown  Unknown
a.out              0000000000409A0A  for__signal_handl     Unknown  Unknown
libc.so.6          00007F7AC364C520  Unknown               Unknown  Unknown
libhdf5.so.310.3.  00007F7AC30BAB89  H5T__init_native_     Unknown  Unknown
libhdf5.so.310.3.  00007F7AC2FF11A6  H5T_init              Unknown  Unknown
libhdf5.so.310.3.  00007F7AC30D87E9  H5VL_init_phase2      Unknown  Unknown
libhdf5.so.310.3.  00007F7AC2DD626F  H5_init_library       Unknown  Unknown
libhdf5.so.310.3.  00007F7AC2DD741D  H5open                Unknown  Unknown
libhdf5_f90cstub.  00007F7AC35D95CF  h5init_types_c        Unknown  Unknown
libhdf5_fortran.s  00007F7AC395707D  h5lib_mp_h5open_f     Unknown  Unknown
a.out              000000000040A21B  Unknown               Unknown  Unknown
a.out              000000000040A1BD  Unknown               Unknown  Unknown
libc.so.6          00007F7AC3633D90  Unknown               Unknown  Unknown
libc.so.6          00007F7AC3633E40  __libc_start_main     Unknown  Unknown
a.out              000000000040A0D5  Unknown               Unknown  Unknown

I'm not experiencing this with HDF5 version 1.14.0. My OS is Ubuntu 22.04.2 LTS.

derobins commented 7 months ago

This is due to moving our floating-point type introspection to a run-time check instead of a compile-time check (to better support cross-compiling). After the 1.14.3 release, we added code that disables floating-point exceptions while we introspect the types.

Can you check if this behavior persists in the current develop branch?

nncarlson commented 7 months ago

We've hit the same problem with our Fortran code when exploring upgrading to 1.14 from 1.12. With Intel ifort and ifx "-init=snan" causes the problem as the OP reported. But more generally it happens if floating point exception trapping is enabled with any Fortran compiler: via the -fpe0 ifort and ifx flag, or the -ieee=nonstd option with the NAG compiler (which is the default!), or the -ffpe-trap=invalid gfortran flag. We've tentatively had to disable FP exception trapping, which makes us very uncomfortable. I hope this is a (very) temporary issue with HDF5.

derobins commented 7 months ago

> We've hit the same problem with our Fortran code when exploring upgrading to 1.14 from 1.12. With Intel ifort and ifx "-init=snan" causes the problem as the OP reported. But more generally it happens if floating point exception trapping is enabled with any Fortran compiler: via the -fpe0 ifort and ifx flag, or the -ieee=nonstd option with the NAG compiler (which is the default!), or the -ffpe-trap=invalid gfortran flag. We've tentatively had to disable FP exception trapping, which makes us very uncomfortable. I hope this is a (very) temporary issue with HDF5.

Have you tried the latest develop or hdf5_1_14 branches? Those should have a fix for this.

derobins commented 6 months ago

Is this fixed with 1.14.4?

nncarlson commented 6 months ago

I tested with the 1.14.4-2 tar file and it appears to be fixed. I'm not sure why the 1.14.5 milestone was mentioned.