ecmwf / ecbuild

A CMake-based build system, consisting of a collection of CMake macros and functions that ease the managing of software build systems
https://ecbuild.readthedocs.io
Apache License 2.0

ecbuild "hidden symbol" error with intel #13

Closed mmiesch closed 1 year ago

mmiesch commented 4 years ago

Recent versions of ecbuild include the check_linker test which, according to the warning message, checks that the linker supports the $ORIGIN Rpath token expansion. If it does not, then it adds the following compiler flag:

-Wl,--allow-shlib-undefined

We are finding that this flag causes errors with old versions of the GNU ld linker. Specifically, we've seen it with version 2.26.1, which is the latest version available through the apt package manager on Ubuntu 16.04, and with version 2.23.1, which is installed on NASA's Discover HPC system. For these linkers, the check_linker test fails and the flag is added. The build then fails when linking its first executable, with an error message like this:

cd /home/ubuntu/jedi/ufo-bundle/build/eckit/src/tools && /usr/local/bin/cmake -E cmake_link_script CMakeFiles/dhcopy.dir/link.txt --verbose=1
/opt/intel/compilers_and_libraries_2017.1.132/linux/mpi/intel64/bin/mpiicpc  -O2 -g -DNDEBUG      -Wl,--disable-new-dtags    -Wl,--allow-shlib-undefined -rdynamic CMakeFiles/dhcopy.dir/dhcopy.cc.o  -o eckit-dhcopy -Wl,-rpath,/home/ubuntu/jedi/ufo-bundle/build/lib ../../../lib/libeckit_option.so ../../../lib/libeckit.so /usr/lib/x86_64-linux-gnu/libssl.so /usr/lib/x86_64-linux-gnu/libcrypto.so /usr/lib/x86_64-linux-gnu/libcurl.so /usr/lib/x86_64-linux-gnu/librt.so -lm -ldl 
ld: eckit-dhcopy: hidden symbol `__intel_cpu_features_init_x' in /opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64_lin/libirc.a(cpu_feature_disp.o) is referenced by DSO
ld: final link failed: Bad value
eckit/src/tools/CMakeFiles/dhcopy.dir/build.make:90: recipe for target 'eckit/src/tools/eckit-dhcopy' failed
make[2]: *** [eckit/src/tools/eckit-dhcopy] Error 1
make[2]: Leaving directory '/home/ubuntu/jedi/ufo-bundle/build'
CMakeFiles/Makefile2:2357: recipe for target 'eckit/src/tools/CMakeFiles/dhcopy.dir/all' failed
make[1]: *** [eckit/src/tools/CMakeFiles/dhcopy.dir/all] Error 2
make[1]: Leaving directory '/home/ubuntu/jedi/ufo-bundle/build'
Makefile:162: recipe for target 'all' failed
make: *** [all] Error 2

The failure occurs only when linking executables. Compilation and testing succeed if we comment out this line in ./share/ecbuild/cmake/ecbuild_check_os.cmake:

https://github.com/ecmwf/ecbuild/blob/12005bf103f57f81df18a11db6945647c06f5bd8/cmake/ecbuild_check_os.cmake#L300

We only see this problem when compiling with the Intel compiler suite using GNU ld versions 2.26.1 or earlier. For example, everything works fine in our Ubuntu 18.04 containers, where we are running GNU ld version 2.30.0. So I'm not sure this is really a bug worth fixing, since it only shows up with relatively old linkers. Still, we're interested in why this flag was added.

oiffrig commented 4 years ago

We, as well as some of our users, have been experiencing this too with recent Intel compilers. For now, the only workaround we have is the one you identified. You can prevent ecbuild from adding this option by setting -DECBUILD_DISABLE_RPATH_FIX=ON on the command line.

This flag is added only for old linkers (GNU ld prior to 2.28), because they do not expand $ORIGIN while linking, causing failures in our software due to seemingly missing symbols. See https://sourceware.org/bugzilla/show_bug.cgi?id=20535 for details. Our solution was to add the -Wl,--allow-shlib-undefined flag in that case, but we will have to improve on that.
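
For anyone scripting around this, here is a minimal sketch of how a build wrapper might apply the workaround only on affected systems. It assumes the 2.28 threshold quoted above and a GNU userland (`sort -V`, `grep -o`). The helper names `version_lt`, `ld_version` and `rpath_fix_args` are hypothetical and not part of ecbuild, and ecbuild itself decides via the check_linker test rather than a version comparison.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of ecbuild.

# Succeeds if version $1 is strictly older than version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Prints the first dotted version number found in the `ld --version` banner,
# or nothing if ld is not available.
ld_version() {
    ld --version 2>/dev/null | head -n 1 |
        grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Prints the extra ecbuild argument when the given ld version predates 2.28
# (the threshold quoted in this thread); prints nothing otherwise.
rpath_fix_args() {
    if [ -n "$1" ] && version_lt "$1" 2.28; then
        printf '%s\n' '-DECBUILD_DISABLE_RPATH_FIX=ON'
    fi
}
```

A wrapper could then call something like `ecbuild $(rpath_fix_args "$(ld_version)") ..`, leaving the default behaviour untouched on newer linkers. Note that disabling the fix also removes the protection it was added for, so this is only appropriate where the "hidden symbol" error actually appears.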

mmiesch commented 4 years ago

Thanks @oiffrig - it's good to know about that command-line option. Please keep us posted if you make any changes.

oiffrig commented 4 years ago

Hi @mmiesch, I am trying to reproduce the bug, and managed to do so with some mixed Fortran/C code, but your error seems to come from eckit, which seems to build fine in my case. Could you give me more details about that? In particular, which eckit version and which options you used would be very helpful.

mmiesch commented 4 years ago

Thanks @oiffrig for taking another look at this. We're working from our own eckit fork that branches from upstream version 1.4.0. We really aren't using any particular options. eckit is typically built in Release mode with the only other option being the install prefix.

ecbuild -DCMAKE_INSTALL_PREFIX=$prefix --build=Release ..
make $verb -j${NTHREADS:-4}
$SUDO make install

wdeconinck commented 1 year ago

@mmiesch is this still an issue worth looking into?

climbfuji commented 1 year ago

@wdeconinck @mmiesch left JCSDA more than a year ago. We haven't seen this problem lately, presumably because the newer systems don't use GNU ld versions 2.26.1 or earlier. I therefore suggest closing this issue.

wdeconinck commented 1 year ago

Thanks for letting us know :)