Open marler8997 opened 4 years ago
Is there even a way to make a functional "embedded" loader that solves this problem properly on Linux, in theory?
Even if the binary embedded such a loader statically, the directories the system loader searches for libraries are platform-dependent. It's even possible to run two environments (glibc + musl) in parallel with some hacking, so that musl-built binaries load their entirely separate set of libs located in e.g. /lib/x86_64-linux-musl/ or whatever path you want.
It looks like the Zig compiler already has logic to find the system's dynamic linker (see std/zig/system.zig).
Given this, if we simply move that logic from link time to runtime, then we have a solution. At the cost of extra startup code, we can create exes that support as many distributions as we care to support.
I think it would be reasonable to choose the default behavior based on whether we are compiling for a native or a cross target: find the system's dynamic linker at "link time" when compiling natively, and find it at runtime when cross compiling.
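To make the "find it at runtime" idea concrete, here is a minimal sketch in C of startup code probing a list of candidate loader paths. The function name find_loader and the candidate list are hypothetical and only illustrative. The sketch also uses libc's access() for brevity, whereas real startup code that runs before any loader would presumably need raw system calls.

```c
#include <stddef.h>
#include <unistd.h>

/* Illustrative candidates only (an assumption, not an exhaustive list).
 * Distributions such as NixOS keep the loader in a store path that
 * cannot be enumerated like this. */
static const char *const default_candidates[] = {
    "/lib64/ld-linux-x86-64.so.2",
    "/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2",
    "/lib/ld-musl-x86_64.so.1",
    NULL,
};

/* Return the first candidate that exists and is executable,
 * or NULL if none of them is usable. The list must end with NULL. */
static const char *find_loader(const char *const *candidates)
{
    for (size_t i = 0; candidates[i] != NULL; i++) {
        if (access(candidates[i], X_OK) == 0)
            return candidates[i];
    }
    return NULL;
}
```

Once a loader is found, the program could re-execute itself through it, for example with execv(loader, argv) where argv is the loader path, then the path of the executable, then the original arguments. It would also need some guard against re-executing forever.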
Update: I've gotten a proof of concept to work, written in C using the zig cc compiler. I'm able to compile a shared library, and then an executable that uses it, and run it without an ELF interpreter. See https://github.com/marler8997/reloader
$ git clone https://github.com/marler8997/reloader
$ cd reloader/c
$ make
...
$ ./out/app-nolibc
RELOADER: reloading with this loader: /nix/store/xg6ilb9g9zhi2zg1dpi4zcp288rhnvns-glibc-2.30/lib/ld-linux-x86-64.so.2
RELOADER: already reloaded
example message to print integer 1234: 1234
foopassthru(123) = 123
Success
$ patchelf --print-interpreter out/app-nolibc
cannot find .interp section
$ patchelf --print-needed out/app-nolibc
libfoo.so.0
Note that the purpose of this code right now is just to see if this can work in theory. There are many details to be worked out, but I believe I've been able to prove that this solution is possible.
As of now gcc/ld will compile a working app-nolibc, but it adds the ELF interpreter, and I haven't figured out how to prevent it from doing that (setting -Wl,--dynamic-linker= didn't work). Also, removing the .interp section after the fact with objcopy --remove-section .interp causes the kernel to fail to load it with an "exec format error", so that will need to be figured out. But we know these issues are solvable because zig cc is able to generate an exe that does work. Maybe we could add a new operation to patchelf for this, patchelf --remove-interpreter? Also note that this issue occurs if I link to libc whether I'm using zig cc or gcc, so using a new patchelf operation or adding additional options to the toolchains could be in order.
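For context on what "removing the interpreter" would involve: the kernel reads the interpreter from the PT_INTERP program header, not from the .interp section. One possible explanation for the objcopy failure, not confirmed here, is that the program header is left behind after the section is removed. The sketch below checks an in-memory ELF64 image for a PT_INTERP entry. The function name elf64_has_interp is hypothetical, and the code assumes the image has the same byte order as the host.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the ELF64 image has a PT_INTERP program header,
 * 0 if it has none, and -1 if the buffer is not a well-formed
 * ELF64 image. Assumes host byte order (a simplification). */
static int elf64_has_interp(const unsigned char *buf, size_t len)
{
    Elf64_Ehdr eh;

    if (len < sizeof eh)
        return -1;
    memcpy(&eh, buf, sizeof eh);

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;
    if (eh.e_phnum > 0 && eh.e_phentsize != sizeof(Elf64_Phdr))
        return -1;

    /* Walk the program header table, bounds-checking each entry. */
    for (size_t i = 0; i < eh.e_phnum; i++) {
        size_t off = (size_t)eh.e_phoff + i * sizeof(Elf64_Phdr);
        Elf64_Phdr ph;

        if (off > len || len - off < sizeof ph)
            return -1;
        memcpy(&ph, buf + off, sizeof ph);
        if (ph.p_type == PT_INTERP)
            return 1;
    }
    return 0;
}
```

A hypothetical patchelf --remove-interpreter operation would presumably have to drop or neutralize that program header entry as well as the section.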
Take a look at https://github.com/Mic92/nix-ld
On my 64-bit NixOS machine, when I cross build to x86_64-linux-gnu, the ELF interpreter is set to /lib64/ld-linux-x86-64.so.2; however, this loader does not exist on my NixOS distribution. Note that this only occurs when my executable is compiled dynamically (i.e. if I try to link libc); otherwise, no interpreter is required, so the issue does not manifest.
I understand that a fix for this issue might prove difficult. There's no way to set an absolute path to a loader that will work on all distributions. The only fix I can think of is to never use an ELF interpreter. When we need dynamic libraries, we could compile the loader into the final executable...or maybe a better solution would be to include startup code that looks for the system loader that works on as many distributions as we can support.
Here is the code/build file to reproduce the issue: