tomaka / hlua

Rust library to interface with Lua
MIT License

Impossible to use native Lua libraries in case Lua is compiled from source in lua52-sys #200

Open mkpankov opened 6 years ago

mkpankov commented 6 years ago

Steps to reproduce:

  1. Install lrex_posix Lua library
  2. Make sure your system Lua can't be found by pkg-config (for example, remove the pkg-config package, or use a patched lua52-sys that always builds from source).
  3. Initialize an hlua interpreter
  4. Run inside it:
    rex = require "rex_posix"

Expected result: the library loads successfully.

Observed result:

error loading module 'rex_posix' from file '/home/mkpankov/projects/ciri/lua/lib/lua/5.2/rex_posix.so':
    /home/mkpankov/projects/ciri/lua/lib/lua/5.2/rex_posix.so: undefined symbol: lua_checkstack

Workaround: make sure your system has liblua5.2 installed and findable by pkg-config, so that build.rs finds this version - it won't trigger that bug, but requires liblua5.2.so shared library to be installed. So for example on Ubuntu 16.04 you'll need to have both liblua5.2-0 (shared library) and liblua5.2-dev (headers and static library) installed.

What follows is fairly obscure debugging info I gathered while trying to fix this issue. I'll post it in a separate comment.

mkpankov commented 6 years ago

I have a binary crate that links to hlua, which in turn links to lua52-sys and the native liblua5.2.so.0. Inside the hlua interpreter I load a Lua library, lrex-posix.

On my development machine, this has always worked without problems.

On my CI machine, when I try to load a Lua library inside of hlua interpreter, it errors:

error loading module 'rex_posix' from file '/home/jenkins/ciri/lua/lib/lua/5.2/rex_posix.so':
    /home/jenkins/ciri/lua/lib/lua/5.2/rex_posix.so: undefined symbol: lua_checkstack

I have the same Rust versions and cargo versions on both machines:

rustc 1.29.0 (aa3ca1994 2018-09-11)
cargo 1.29.0 (524a578d7 2018-08-05)

Dev machine is Ubuntu 16.04.

The CI machine is a Docker container running on Debian Stretch. The Rust toolchain is installed via rustup and moved to a persistent volume, which is mounted at /opt/cargo and linked into the workspace via

ln -sf /opt/cargo/cargo $HOME/.cargo
ln -sf /opt/cargo/rustup $HOME/.rustup

followed by source $HOME/.cargo/env.

I've also tried installing a fresh toolchain after container startup instead of using the symlinks; the error was the same.

On my dev machine, this symbol is defined in /usr/lib/x86_64-linux-gnu/liblua5.2.so.0.

readelf -s /usr/lib/x86_64-linux-gnu/liblua5.2.so.0:

   211: 0000000000006880   146 FUNC    GLOBAL DEFAULT   13 lua_checkstack@@LUA_5.2

On my dev machine, this library is listed as NEEDED in the dynamic section of the ELF binary:

readelf -d target/debug/ciri:

 0x0000000000000001 (NEEDED)             Shared library: [liblua5.2.so.0]

On my CI machine, the dynamic section of the binary has no NEEDED entry for this library.
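The NEEDED comparison above can be reduced to a one-line filter so the two machines are easy to diff. This is only a sketch: needed_libs is a hypothetical helper name, and it assumes the `readelf -d` output format shown above.

```shell
# needed_libs: hypothetical helper, not part of hlua or lua52-sys.
# Reads `readelf -d` output on stdin and prints each NEEDED library name.
needed_libs() {
    sed -n 's/.*(NEEDED).*Shared library: \[\(.*\)]/\1/p'
}

# Usage (run on each machine, then compare the output):
#   readelf -d target/debug/ciri | needed_libs
```

If liblua5.2.so.0 shows up on one machine and not the other, the two builds were linked against Lua differently.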

When the Lua interpreter loads the library, it calls dlopen internally. I was able to trace the dynamic loading and confirm that the loader can't find the symbol because it never even looks in liblua5.2.so.

On my dev machine, loading the rex-posix Lua library with LD_DEBUG=all LD_DEBUG_OUTPUT=ld-debug gives:

      9994:     symbol=lua_checkstack;  lookup in file=target/debug/ciri [0]
      9994:     symbol=lua_checkstack;  lookup in file=/usr/lib/x86_64-linux-gnu/liblua5.2.so.0 [0]
      9994:     binding file /home/mkpankov/projects/ciri/lua/lib/lua/5.2/rex_posix.so [0] to /usr/lib/x86_64-linux-gnu/liblua5.2.so.0 [0]: normal symbol `lua_checkstack'

On my CI machine, loading the rex-posix Lua library gives:

      3966:     symbol=lua_checkstack;  lookup in file=target/debug/ciri [0]
      3966:     symbol=lua_checkstack;  lookup in file=/lib/x86_64-linux-gnu/libdl.so.2 [0]
      3966:     symbol=lua_checkstack;  lookup in file=/lib/x86_64-linux-gnu/librt.so.1 [0]
      3966:     symbol=lua_checkstack;  lookup in file=/lib/x86_64-linux-gnu/libpthread.so.0 [0]
      3966:     symbol=lua_checkstack;  lookup in file=/lib/x86_64-linux-gnu/libgcc_s.so.1 [0]
      3966:     symbol=lua_checkstack;  lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
      3966:     symbol=lua_checkstack;  lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
      3966:     symbol=lua_checkstack;  lookup in file=/lib/x86_64-linux-gnu/libm.so.6 [0]
      3966:     symbol=lua_checkstack;  lookup in file=/home/jenkins/ciri/lua/lib/lua/5.2/rex_posix.so [0]
      3966:     symbol=lua_checkstack;  lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
      3966:     symbol=lua_checkstack;  lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
      3966:     /home/jenkins/ciri/lua/lib/lua/5.2/rex_posix.so: error: symbol lookup error: undefined symbol: lua_checkstack (fatal)
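To compare the two loader traces without reading them line by line, the search order for a single symbol can be extracted from each log. A minimal sketch, assuming the LD_DEBUG line format shown above and a plain C identifier as the symbol name; lookup_files, dev.log and ci.log are hypothetical names.

```shell
# lookup_files: hypothetical helper.
# $1 is the symbol name; LD_DEBUG output is read from stdin.
# Prints, in order, every file the loader searched for that symbol.
lookup_files() {
    sed -n "s/.*symbol=$1;  *lookup in file=\([^ ]*\) .*/\1/p"
}

# Usage (dev.log and ci.log stand for the two saved LD_DEBUG outputs):
#   lookup_files lua_checkstack < dev.log > dev.txt
#   lookup_files lua_checkstack < ci.log  > ci.txt
#   diff dev.txt ci.txt
```

On the traces above, the dev list would end with liblua5.2.so.0 while the CI list never mentions it.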

Further debugging shows that when Lua is compiled from source, the resulting target/debug/liblua52_sys.rlib contains the lua_checkstack symbol. When the system Lua is used via pkg-config, the rlib doesn't contain it.

The same holds for a program that uses this library: the resulting binary contains lua_checkstack when Lua is compiled from source, and doesn't when the system Lua is used.

Lua then loads the native library with dlopen (loadlib.c:135), and for some reason the lookup misses the symbol, even though it's in the binary, global and visible:

➜  ciri git:(master) ✗ readelf -s target/debug/ciri | grep lua_checkstack             
 24374: 0000000000525166   247 FUNC    GLOBAL DEFAULT   13 lua_checkstack

It seems that when lua_checkstack is not in the binary, the linker compensates by recording a dependency on the proper liblua5.2.so. When it is in the binary, the linker assumes it will be found when dlopen resolves the loaded library's symbols, but for some reason that doesn't happen.
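One check that may narrow this down: `readelf -s` prints both the dynamic symbol table (.dynsym) and the full symbol table (.symtab), and a library loaded with dlopen resolves its undefined symbols only against dynamic symbol tables. The sketch below reports which table defines a symbol. symbol_tables is a hypothetical helper, and the idea that lua_checkstack is present only in .symtab is an assumption that has not been verified in this thread.

```shell
# symbol_tables: hypothetical helper.
# $1 is the symbol name; `readelf -s` output is read from stdin.
# Prints the name of each symbol table that defines the symbol.
symbol_tables() {
    awk -v sym="$1" -v q="'" '
        # Header line, e.g.: Symbol table (quoted name) contains N entries:
        /^Symbol table/ { gsub(q, "", $3); table = $3 }
        # Entry line: field 7 is the section index, field 8 the name.
        { name = $8; sub(/@.*/, "", name) }
        name == sym && $7 != "UND" { print table }
    '
}

# Usage:
#   readelf -s target/debug/ciri | symbol_tables lua_checkstack
```

If this prints only .symtab for the from-source build, the symbol is not exported for dynamic lookup, which would be consistent with the CI loader trace. Linking the executable with -rdynamic (GNU ld's --export-dynamic) is the usual way to export such symbols; whether that resolves this issue is untested here.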