apolukhin / Boost.DLL

Library for comfortable work with DLL and DSO
https://boost.org/libs/dll

`boost::dll::shared_library self(boost::dll::program_location());` causing issues on Linux #55

Open dennisklein opened 3 years ago

dennisklein commented 3 years ago

Dear developers,

we are using Boost.DLL for a small plugin system. Some plugins are linked into the executable following this guide: https://www.boost.org/doc/libs/1_76_0/doc/html/boost_dll/tutorial.html#boost_dll.tutorial.linking_plugin_into_the_executable.

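The `boost::dll::shared_library self(boost::dll::program_location());` statement, however, is causing two classes of errors for us (see FairRootGroup/FairMQ#351): double static initialization, and intermittent load failures reporting `cannot dynamically load executable`.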

At first glance, it looks like `boost::dll::shared_library self(boost::dll::program_location());` is implemented with a `dlopen("/path/to/executable", ...)` on Linux. The Linux manpage, however, suggests using `dlopen(NULL, ...)` for this case. But I am no expert here. Do you have any deeper insight into what might be the underlying cause of the problems described above? Your comments are very much appreciated!
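To make the difference concrete, here is a minimal sketch of the `dlopen(NULL, ...)` variant the manpage describes, which returns a handle for the main program that is already loaded instead of opening the executable by path. The helper names `open_main_program` and `close_main_program` are hypothetical and are not part of Boost.DLL; this only illustrates the plain POSIX call, not how Boost.DLL is or should be implemented.

```cpp
#include <dlfcn.h>

#include <cassert>
#include <string>

// Hypothetical helper (not a Boost.DLL API): ask the dynamic loader for a
// handle to the main program instead of dlopen()-ing the executable's path.
// On failure, returns nullptr and stores the dlerror() text in *error.
void* open_main_program(std::string* error = nullptr) {
    dlerror();  // clear any stale error state
    void* handle = dlopen(nullptr, RTLD_LAZY);
    if (handle == nullptr && error != nullptr) {
        const char* msg = dlerror();
        *error = (msg != nullptr) ? msg : "unknown dlopen error";
    }
    return handle;
}

// Hypothetical helper: release a handle obtained from open_main_program().
// Returns true if dlclose() reported success.
bool close_main_program(void* handle) {
    assert(handle != nullptr);
    return dlclose(handle) == 0;
}
```

A caller would pass the returned handle to `dlsym` to look up plugin symbols; for that to find anything, the executable has to export them (for example by linking with `-rdynamic`). On older glibc versions the program also needs to be linked with `-ldl`.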

apolukhin commented 3 years ago

Our tests do not show such a problem: https://www.boost.org/development/tests/develop/developer/dll.html

Could you provide a minified example to reproduce the issue?

dennisklein commented 3 years ago

> Could you provide a minified example to reproduce the issue?

See https://github.com/dennisklein/doublestaticinit for a small reproducer. It depends on CMake and Singularity. If you do not have those dependencies: `checker.cpp` is the program under test, and it is run via `gdb --batch --command=gdbchecker --args checker`, where `gdbchecker` is the gdb script in the same repository.

https://github.com/FairRootGroup/FairMQ/issues/351 shows a table on which systems we see the double static initialization.

On containerized CentOS 8 Continuous Integration environments we sometimes (20% chance or so) see errors like this: `boost::dll::shared_library::load() failed (dlerror system message: /mnt/mesos/sandbox/sandbox/o2-fullci/sw/slc8_x86-64/O2/5964-local1/bin/o2-sim-primary-server-device-runner: cannot dynamically load executable): Bad file descriptor`

We believe this load failure could be related to ASLR and to our executables not being built as PIEs, and we have found a workaround for it. The more important issue to understand and solve is the double static initialization.
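In case it helps with checking the PIE hypothesis, below is a small sketch that reads the ELF header of a binary and reports whether it is a classic fixed-address executable (`ET_EXEC`) or a shared object / PIE (`ET_DYN`). The `ElfKind` enum and the `classify_elf` function are hypothetical names made up for this illustration. It only inspects the file type and does not prove that this is what triggers the error above.

```cpp
#include <elf.h>

#include <cassert>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical classification used only in this sketch.
enum class ElfKind { NotElf, Executable, SharedOrPie, Other };

// Reads the first 18 bytes of the file: the 16-byte e_ident array followed by
// the 16-bit e_type field (same layout for 32-bit and 64-bit ELF files).
ElfKind classify_elf(const std::string& path) {
    assert(!path.empty());
    std::ifstream in(path, std::ios::binary);
    unsigned char header[EI_NIDENT + 2] = {};
    if (!in.read(reinterpret_cast<char*>(header), sizeof header)) {
        return ElfKind::NotElf;  // missing, unreadable, or too short
    }
    if (std::memcmp(header, ELFMAG, SELFMAG) != 0) {
        return ElfKind::NotElf;  // magic bytes do not match "\177ELF"
    }
    // e_type is stored in the byte order announced by e_ident[EI_DATA].
    const unsigned b0 = header[EI_NIDENT];
    const unsigned b1 = header[EI_NIDENT + 1];
    const std::uint16_t type =
        (header[EI_DATA] == ELFDATA2MSB)
            ? static_cast<std::uint16_t>((b0 << 8) | b1)
            : static_cast<std::uint16_t>((b1 << 8) | b0);
    if (type == ET_EXEC) return ElfKind::Executable;   // non-PIE executable
    if (type == ET_DYN) return ElfKind::SharedOrPie;   // shared object or PIE
    return ElfKind::Other;
}
```

Calling `classify_elf("/proc/self/exe")` from inside the affected program, or passing the path from the error message, would show how the binary was linked.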