OSGeo / gdal

GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
https://gdal.org

`gdalinfo test.tif` returns SIGABRT when compiled with ECW 5.4/5.5 support and `proj.db` installed #2394

Closed ggardet closed 2 years ago

ggardet commented 4 years ago

Expected behavior and actual behavior.

gdalinfo test.tif returns SIGABRT when compiled with ECW 5.4/5.5 support and proj.db installed. If I remove the proj.db file, the SIGABRT no longer occurs.

gdb traces with ecw 5.4:

(gdb) run /home/guillaume/error_gdalinfo.tif
Starting program: /usr/bin/gdalinfo /home/guillaume/error_gdalinfo.tif
Missing separate debuginfos, use: zypper install gdal-debuginfo-3.0.4-121.9.x86_64
warning: the debug information found in "/usr/lib64/libNCSEcw.so.debug" does not match "/usr/lib64/libNCSEcw.so.5.4.0" (CRC mismatch).

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Driver: GTiff/GeoTIFF
Files: /home/guillaume/error_gdalinfo.tif
Size is 11228, 27199

Program received signal SIGABRT, Aborted.
0x00007ffff69faea1 in raise () from /lib64/libc.so.6
(gdb) where
#0  0x00007ffff69faea1 in raise () from /lib64/libc.so.6
#1  0x00007ffff69e453d in abort () from /lib64/libc.so.6
#2  0x00007ffff2a5836a in ?? () from /lib64/libgcc_s.so.1
#3  0x00007ffff2a687cd in __gcc_personality_v0 () from /lib64/libgcc_s.so.1
#4  0x00007ffff4ce0ed3 in ?? () from /usr/lib64/libNCSEcw.so.5.4.0
#5  0x00007ffff4ce158a in ?? () from /usr/lib64/libNCSEcw.so.5.4.0
#6  0x00007ffff4c475db in __cxa_throw () from /usr/lib64/libNCSEcw.so.5.4.0
#7  0x00007ffff52303b8 in osgeo::proj::operation::Conversion::_exportToPROJString (this=0x555555a3cca0, formatter=0x555555995200) at /usr/include/c++/9/ext/new_allocator.h:89
#8  0x00007ffff532ff2d in osgeo::proj::io::IPROJStringExportable::exportToPROJString[abi:cxx11](osgeo::proj::io::PROJStringFormatter*) const (this=0x555555a3cce8, formatter=0x555555995200) at iso19111/io.cpp:6262
#9  0x00007ffff52e1a23 in pj_obj_create (ctx=0x55555568fe20, objIn=...) at /usr/include/c++/9/bits/unique_ptr.h:360
#10 0x00007ffff52cfe06 in proj_create_conversion (ctx=0x55555568fe20, name=0x0, auth_name=0x0, code=0x0, method_name=0x0, method_auth_name=0x0, method_code=0x0, param_count=0, params=0x0) at /usr/include/c++/9/bits/shared_ptr_base.h:756
#11 0x00007ffff7aae3da in OGRSpatialReference::SetProjCS(char const*) () from /usr/lib64/libgdal.so.26
#12 0x00007ffff7880ecb in GTIFGetOGISDefnAsOSR () from /usr/lib64/libgdal.so.26
#13 0x00007ffff78bd433 in GTiffDataset::LookForProjection() () from /usr/lib64/libgdal.so.26
#14 0x00007ffff78adc73 in GTiffDataset::GetSpatialRef() const () from /usr/lib64/libgdal.so.26
#15 0x00007ffff79e62be in GDALInfo () from /usr/lib64/libgdal.so.26
#16 0x0000555555555357 in main ()

gdb traces with ecw 5.5:

error_gdalinfo.tif
Size is 11228, 27199

Program received signal SIGABRT, Aborted.
0x00007ffff69faea1 in raise () from /lib64/libc.so.6
(gdb) where
#0  0x00007ffff69faea1 in raise () from /lib64/libc.so.6
#1  0x00007ffff69e453d in abort () from /lib64/libc.so.6
#2  0x00007ffff29d136a in ?? () from /lib64/libgcc_s.so.1
#3  0x00007ffff29e17cd in __gcc_personality_v0 () from /lib64/libgcc_s.so.1
#4  0x00007ffff4d05833 in ?? () from /usr/lib64/libNCSEcw.so.5.5.0
#5  0x00007ffff4d05eea in ?? () from /usr/lib64/libNCSEcw.so.5.5.0
#6  0x00007ffff4c6c17b in __cxa_throw () from /usr/lib64/libNCSEcw.so.5.5.0
#7  0x00007ffff52303b8 in osgeo::proj::operation::Conversion::_exportToPROJString (this=0x555555a3c3d0, formatter=0x555555994990) at /usr/include/c++/9/ext/new_allocator.h:89
#8  0x00007ffff532ff2d in osgeo::proj::io::IPROJStringExportable::exportToPROJString[abi:cxx11](osgeo::proj::io::PROJStringFormatter*) const (this=0x555555a3c418, formatter=0x555555994990) at iso19111/io.cpp:6262
#9  0x00007ffff52e1a23 in pj_obj_create (ctx=0x55555568f590, objIn=...) at /usr/include/c++/9/bits/unique_ptr.h:360
#10 0x00007ffff52cfe06 in proj_create_conversion (ctx=0x55555568f590, name=0x0, auth_name=0x0, code=0x0, method_name=0x0, method_auth_name=0x0, method_code=0x0, param_count=0, params=0x0) at /usr/include/c++/9/bits/shared_ptr_base.h:756
#11 0x00007ffff7aafd6a in OGRSpatialReference::SetProjCS(char const*) () from /usr/lib64/libgdal.so.26
#12 0x00007ffff7889a1b in GTIFGetOGISDefnAsOSR () from /usr/lib64/libgdal.so.26
#13 0x00007ffff78ae533 in GTiffDataset::LookForProjection() () from /usr/lib64/libgdal.so.26
#14 0x00007ffff78b8a13 in GTiffDataset::GetSpatialRef() const () from /usr/lib64/libgdal.so.26
#15 0x00007ffff79e0f2e in GDALInfo () from /usr/lib64/libgdal.so.26
#16 0x0000555555555357 in main ()

Steps to reproduce the problem.

Compile gdal with proj7 and ECW SDK 5.4 or 5.5 (Desktop Read-only) and run gdalinfo test.tif when proj.db is installed.

Operating system

openSUSE Leap 15.1 / openSUSE Tumbleweed x86_64

GDAL version and provenance

gdal 3.0.4 from openSUSE + ECW option enabled.

rouault commented 4 years ago

@christapley Any hint on this? I cannot reproduce on Ubuntu 16.04 with ECW 5.5, but I do see that libNCSEcw.so.5.5.0 exports a __cxa_throw symbol, which is quite suspect since this symbol is normally exported by libstdc++.so.6. So there is evidently a symbol clash here, and PROJ C++ exceptions are thrown through libNCSEcw instead of libstdc++. Actually, I see that libNCSEcw.so.5.5.0 doesn't link against libstdc++.so.6, so I suspect it statically links it, which is the likely cause of this issue.
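
The observation that libNCSEcw.so does not link against libstdc++.so.6 can be checked by listing the library's DT_NEEDED entries. Below is a minimal sketch, not part of GDAL or the ECW SDK: `needed_libraries`, `links_against` and `read_dynamic_section` are hypothetical helper names, and the code assumes `readelf` from binutils is on the PATH and prints entries in its usual `(NEEDED) Shared library: [name]` form.

```python
import re
import subprocess
from typing import List

# Matches lines such as:
#  0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
_NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[([^\]]+)\]")


def needed_libraries(readelf_output: str) -> List[str]:
    """Return the DT_NEEDED library names found in `readelf -d` output."""
    return _NEEDED_RE.findall(readelf_output)


def links_against(readelf_output: str, prefix: str = "libstdc++.so") -> bool:
    """True if any DT_NEEDED entry starts with `prefix`."""
    return any(name.startswith(prefix)
               for name in needed_libraries(readelf_output))


def read_dynamic_section(path: str) -> str:
    """Run `readelf -d` on a shared library (assumes binutils is installed)."""
    result = subprocess.run(
        ["readelf", "-d", path], check=True, capture_output=True, text=True
    )
    return result.stdout
```

For example, `links_against(read_dynamic_section("/usr/lib64/libNCSEcw.so.5.5.0"))` returning False would be consistent with libstdc++ being linked in statically, though on its own it does not prove it.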

ggardet commented 4 years ago

I forgot to add that I also have fgdb_api enabled. I am rebuilding without it atm to check if it has an influence. It does not change anything. So, we can continue to work on this.

akontsevich commented 4 years ago

Seems I have exact same or similar crash and stack trace on openSUSE Tumbleweed while it works fine on Ubuntu for my colleagues. See details here: https://stackoverflow.com/q/61164590/630169

Could this be a problem with how the SUSE packages are built (option switches or something)? Or is it a version problem with GDAL and its required libraries?

akontsevich commented 4 years ago

Have solved my issue. Problem was in libproj version: https://stackoverflow.com/a/61198016/630169

rouault commented 4 years ago

For clarity, suppressing the 2 above posts from @akontsevich which aren't relevant to that ticket

akontsevich commented 4 years ago

They are relevant as I got same crashes and same call stack in openSUSE for gdal-3 linked app. So something is wrong in SUSE gdal packages builds.

christapley commented 4 years ago

Hi @rouault, I'm not working for Hexagon any longer but I have made sure that they are aware of this issue.

jbowman-hexagon commented 4 years ago

Hi @ggardet. Can you please supply your ./configure arguments for PROJ and GDAL, and a directory listing of your ECWJP2 SDK directory (recursive, or primarily the contents of ./lib)?

jbowman-hexagon commented 4 years ago

@christapley Any hint on this? I cannot reproduce on Ubuntu 16.04 with ECW 5.5, but I do indeed see that libNCSEcw.so.5.5.0 exports a __cxa_throw symbol, which is quite suspect since this symbol is normally exported by libstdc++.so.6. So it is obvious here there's a symbol clash, and that PROJ C++ exceptions are thrown through libNCSEcw instead of libstdc++. Actually, I see that libNCSEcw.so.5.5.0 doesn't link against libstdc++.so.6, so I suspect it statically links it, which is the likely cause of that issue.

@rouault Where do you see that it exports that symbol? Using 'nm -gD .../libNCSEcw.so.5.5' I see that __cxa_throw is an external symbol. Not sure that these conclusions are accurate.

rouault commented 4 years ago

__cxa_throw is definitely embedded in libNCSEcw.so

Compare

$ objdump -T ~/hexagon/ERDAS-ECW_JPEG_2000_SDK-5.5.0/Desktop_Read-Only/lib/cpp11abi/x64/release/libNCSEcw.so | grep __cxa_throw
000000000072bc80 g    DF .text  0000000000000068  Base        __cxa_throw

vs

$ objdump -T  libgdal.so | grep __cxa_throw
0000000000000000      DF *UND*  0000000000000000      

Or with nm

$ nm -gD ~/hexagon/ERDAS-ECW_JPEG_2000_SDK-5.5.0/Desktop_Read-Only/lib/cpp11abi/x64/release/libNCSEcw.so | grep __cxa_throw
000000000072bc80 T __cxa_throw

vs

$ nm -gD  libgdal.so | grep __cxa_throw
                 U __cxa_throw
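
The nm comparison above can be automated. Below is a minimal sketch, assuming the `nm -gD` output format shown in this thread (an optional address column, a one-letter type, then the symbol name); `symbol_status` is a hypothetical helper, not part of binutils or any SDK.

```python
from typing import Optional


def symbol_status(nm_output: str, symbol: str = "__cxa_throw") -> Optional[str]:
    """Classify `symbol` in `nm -gD` output.

    Returns "defined" if the library provides the symbol itself,
    "undefined" if it imports it from another library, or None if the
    symbol does not appear at all.
    """
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # Drop a version suffix such as "@CXXABI_1.3" if nm printed one.
        name = fields[-1].split("@", 1)[0]
        if name != symbol:
            continue
        # Undefined entries look like "U name"; defined ones "addr T name".
        # Simplification: any type other than "U" is treated as defined.
        sym_type = fields[-2]
        return "undefined" if sym_type == "U" else "defined"
    return None
```

Fed the two outputs above, this returns "defined" for libNCSEcw.so and "undefined" for libgdal.so. A library that reports "defined" for `__cxa_throw` carries its own copy of the C++ runtime's throw routine.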

crichaud-work commented 4 years ago

In ECW SDK 5.5 Update 2, if you run nm --demangle libNCSEcw.a | grep __cxa_throw, all the __cxa_throw entries are "undefined". For the DYNAMIC library libNCSEcw.so, we have not yet found an easy way to remove the static link to libstdc++, which is needed because of a third-party component we use in another application. We tried -fvisibility=hidden plus -Wl,--exclude-libs,ALL, but the exception-related symbols are still exported. We are still investigating. Please use the static library for the moment.

rouault commented 2 years ago

closing as not a GDAL bug