Open mkpankov opened 6 years ago
I have a binary crate that links to hlua and subsequently lua52-sys and native liblua52-0.so. Inside the hlua interpreter I load a Lua library, lrex-posix.
On my development machine, everything worked all the time w/o problems.
On my CI machine, when I try to load a Lua library inside the hlua interpreter, it errors:
error loading module 'rex_posix' from file '/home/jenkins/ciri/lua/lib/lua/5.2/rex_posix.so':
/home/jenkins/ciri/lua/lib/lua/5.2/rex_posix.so: undefined symbol: lua_checkstack
I have the same Rust versions and cargo versions on both machines:
rustc 1.29.0 (aa3ca1994 2018-09-11)
cargo 1.29.0 (524a578d7 2018-08-05)
Dev machine is Ubuntu 16.04.
CI machine is a docker container running on Debian Stretch. Rust toolchain is installed via rustup and moved to a persistent volume, which is mounted at /opt/cargo and linked into the workspace via
ln -sf /opt/cargo/cargo $HOME/.cargo
ln -sf /opt/cargo/rustup $HOME/.rustup
with subsequent source $HOME/.cargo/env.
I've tried installing a fresh toolchain after container bootup instead of using symlinks, and the error stayed the same.
On my dev machine this symbol is located in /usr/lib/x86_64-linux-gnu/liblua5.2.so.0 (readelf -s /usr/lib/x86_64-linux-gnu/liblua5.2.so.0):
211: 0000000000006880 146 FUNC GLOBAL DEFAULT 13 lua_checkstack@@LUA_5.2
This library is properly NEEDED in the dynamic section of the ELF on my dev machine (readelf -d target/debug/ciri):
0x0000000000000001 (NEEDED) Shared library: [liblua5.2.so.0]
This library isn't NEEDED in the dynamic section of the ELF on my CI machine.
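The NEEDED check can be scripted so that the two machines are easy to compare. A minimal sketch, assuming a POSIX shell and binutils' readelf; the needed_libs helper name is made up for illustration and is not part of any tool mentioned here:

```shell
# Hypothetical helper: print the libraries an ELF file declares as NEEDED.
# Reads `readelf -d` output on stdin, one library name per output line.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)\].*/\1/p'
}

# Usage (path taken from the report above):
#   readelf -d target/debug/ciri | needed_libs | grep -F liblua
```

An empty result on the CI machine and liblua5.2.so.0 on the dev machine would match what is described above.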
So when the Lua interpreter loads the library, it calls dlopen internally. I've been able to debug the dynamic loading and confirm that the loader is unable to find the symbol because it doesn't even look for liblua5.2.so.
On my dev machine, loading the rex-posix Lua library with LD_DEBUG=all LD_DEBUG_OUTPUT=ld-debug gives:
9994: symbol=lua_checkstack; lookup in file=target/debug/ciri [0]
9994: symbol=lua_checkstack; lookup in file=/usr/lib/x86_64-linux-gnu/liblua5.2.so.0 [0]
9994: binding file /home/mkpankov/projects/ciri/lua/lib/lua/5.2/rex_posix.so [0] to /usr/lib/x86_64-linux-gnu/liblua5.2.so.0 [0]: normal symbol `lua_checkstack'
On my CI machine, loading the rex-posix Lua library gives:
3966: symbol=lua_checkstack; lookup in file=target/debug/ciri [0]
3966: symbol=lua_checkstack; lookup in file=/lib/x86_64-linux-gnu/libdl.so.2 [0]
3966: symbol=lua_checkstack; lookup in file=/lib/x86_64-linux-gnu/librt.so.1 [0]
3966: symbol=lua_checkstack; lookup in file=/lib/x86_64-linux-gnu/libpthread.so.0 [0]
3966: symbol=lua_checkstack; lookup in file=/lib/x86_64-linux-gnu/libgcc_s.so.1 [0]
3966: symbol=lua_checkstack; lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
3966: symbol=lua_checkstack; lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
3966: symbol=lua_checkstack; lookup in file=/lib/x86_64-linux-gnu/libm.so.6 [0]
3966: symbol=lua_checkstack; lookup in file=/home/jenkins/ciri/lua/lib/lua/5.2/rex_posix.so [0]
3966: symbol=lua_checkstack; lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
3966: symbol=lua_checkstack; lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
3966: /home/jenkins/ciri/lua/lib/lua/5.2/rex_posix.so: error: symbol lookup error: undefined symbol: lua_checkstack (fatal)
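The two traces are easier to compare once reduced to the list of objects searched. A minimal sketch, assuming a POSIX shell and a symbol name without regex metacharacters; lookup_order is a hypothetical helper name:

```shell
# Hypothetical helper: list, in search order, every object the dynamic
# loader consulted while resolving one symbol.
# $1 is the symbol name; LD_DEBUG output is read on stdin.
lookup_order() {
    sed -n "s/.*symbol=$1; *lookup in file=\([^ ]*\).*/\1/p"
}

# Usage: glibc appends the process ID to the LD_DEBUG_OUTPUT name,
# so the traces land in files such as ld-debug.9994.
#   cat ld-debug.* | lookup_order lua_checkstack
```

Diffing the output from both machines shows directly that liblua5.2.so.0 is in the search list on one and missing on the other.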
Further debugging shows that when Lua is compiled from source, the compiled target/debug/liblua52_sys.rlib does contain the aforementioned lua_checkstack symbol. When system Lua is used via pkg-config, the rlib doesn't contain the lua_checkstack symbol.
When this library is used in a program, the resulting binary also contains the lua_checkstack symbol when Lua is compiled from source, and doesn't when Lua is found in the system.
Then Lua, when loading the native library, uses dlopen in loadlib.c:135 and for some reason misses the symbol it searches for, even though it's in the binary, global and visible:
➜ ciri git:(master) ✗ readelf -s target/debug/ciri | grep lua_checkstack
24374: 0000000000525166 247 FUNC GLOBAL DEFAULT 13 lua_checkstack
It seems that when lua_checkstack is not in the binary, the surrounding code fixes that during linking by requesting the proper liblua5.2.so. When it is in the binary, the linker thinks it will be found when dlopen tries to resolve its symbols, but that doesn't happen for some reason.
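One thing that may be worth ruling out here, offered as an assumption rather than a diagnosis: readelf -s prints both the static symbol table (.symtab) and the dynamic one (.dynsym), and the dynamic loader resolves symbols for a dlopen'ed library only against .dynsym. A symbol that is GLOBAL in .symtab but absent from .dynsym would therefore show up in a grep of readelf -s output and still be invisible to rex_posix.so. A minimal sketch that reports which table defines a symbol; defining_tables is a hypothetical helper name:

```shell
# Hypothetical helper: print each symbol table that defines a symbol.
# $1 is the symbol name; `readelf -s` output is read on stdin.
defining_tables() {
    awk -v sym="$1" -v q="'" '
        # Header line names the table between single quotes.
        /^Symbol table/ { split($0, parts, q); table = parts[2]; next }
        # Entry columns: Num Value Size Type Bind Vis Ndx Name.
        # Strip any @VERSION suffix and skip undefined (UND) entries.
        NF >= 8 && $7 != "UND" {
            name = $8
            sub(/@.*/, "", name)
            if (name == sym) print table
        }
    '
}

# Usage:
#   readelf -s target/debug/ciri | defining_tables lua_checkstack
```

If this prints only .symtab, the symbol is present in the file but not exported to the dynamic loader; if it prints .dynsym as well, this explanation does not apply.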
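Steps to reproduce:
1. Install the lrex_posix Lua library.
2. Make sure Lua is not found via pkg-config (a way to do that is to remove the pkg-config package or use a patched lua52-sys which always builds from source).
3. Load the library inside the hlua interpreter.

Expected result: library successfully loads.
Observed result: the undefined symbol: lua_checkstack error shown above.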
Workaround: make sure your system has liblua5.2 installed and findable by pkg-config, so that build.rs finds this version. It won't trigger the bug, but requires the liblua5.2.so shared library to be installed. So, for example, on Ubuntu 16.04 you'll need to have both liblua5.2-0 (shared library) and liblua5.2-dev (headers and static library) installed.

What follows is pretty obscure debugging info I gathered while trying to fix this issue. I'll post it in a separate comment.