warptools / ldshim

[RFC] Perhaps a shim is not necessary? #1

Open zhuyifei1999 opened 1 year ago

zhuyifei1999 commented 1 year ago

I was reading https://zapps.app/technology/ yesterday and it occurred to me that the shim process seemed extraneous.

ld.so is an ELF dynamic object, and so is a dynamically linked executable:

$ readelf -h /lib64/ld-linux-x86-64.so.2 | grep Type
  Type:                              DYN (Shared object file)
$ readelf -h /bin/bash | grep Type
  Type:                              DYN (Position-Independent Executable file)

Both are able to run, because there's an entry point:

$ readelf -h /lib64/ld-linux-x86-64.so.2 | grep Entry
  Entry point address:               0x1b190
$ readelf -h /bin/bash | grep Entry
  Entry point address:               0x67020

But ld.so does not have an INTERP ELF segment:

$ readelf -lW /lib64/ld-linux-x86-64.so.2 | grep INTERP -A 1
$ readelf -lW /bin/bash | grep INTERP -A 1
  INTERP         0x000318 0x0000000000000318 0x0000000000000318 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]

This means, to start the execution of a dynamically linked object, the INTERP segment is optional as far as the kernel is concerned. If the object has no INTERP, the kernel will load it at a randomized address and jump to the entry point.
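The same check that `readelf -lW ... | grep INTERP` does can be sketched in C. This is only an illustration, assuming a 64-bit ELF image in host byte order that has already been read into memory; the function name `elf64_has_interp` is made up for this example and is not part of the POC.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not from the POC.
 * Returns 1 if the ELF64 image has a PT_INTERP program header, 0 if it has
 * none, and -1 if the buffer does not look like a usable ELF64 image.
 * Assumes the image is in host byte order. */
int elf64_has_interp(const unsigned char *image, size_t len)
{
    Elf64_Ehdr eh;

    if (len < sizeof(eh))
        return -1;
    memcpy(&eh, image, sizeof(eh));

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;
    if (eh.e_phnum != 0 && eh.e_phentsize != sizeof(Elf64_Phdr))
        return -1;

    /* Make sure the whole program header table lies inside the buffer. */
    if (eh.e_phoff > len ||
        (size_t)eh.e_phnum * sizeof(Elf64_Phdr) > len - eh.e_phoff)
        return -1;

    for (size_t i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;

        /* memcpy avoids any alignment assumptions about the buffer. */
        memcpy(&ph, image + eh.e_phoff + i * sizeof(ph), sizeof(ph));
        if (ph.p_type == PT_INTERP)
            return 1;
    }
    return 0;
}
```

Going by the readelf output shown earlier, this should return 0 for the contents of ld.so and 1 for the contents of /bin/bash.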

... which got me thinking: if the entry point is reachable, we can do in userspace what used to be the kernel's job of loading ld.so into the address space, right? All we need to do is fix up the auxiliary vector to make ld.so believe nothing is out of the ordinary.

So here is the POC:

I chose to write it in assembly because I was too lazy to mess with compiler options. I just want something that will work regardless of compiler. Rewriting most of it in C is on my TODO. Only the initial stage of the entry point and then the jump to ld.so have to be assembly, but the rest should be convertible to C.

It also turned out that libc requires an INTERP segment when it's called (otherwise this assertion fails: https://elixir.bootlin.com/glibc/glibc-2.36.9000/source/elf/rtld.c#L1291), so I patched it in at runtime. This unfortunately means that I have that page mapped RWX. I could re-mprotect it with its initial permissions, but that needs a bit more code to find the right segment for the right permission bits.

Why do we care? I think one of the potential use cases where a jumploader could fail is binfmt-misc, or even setuid binaries (though I'm not sure about the security of my approach yet either). In the case of binfmt-misc, the kernel passes information about what's being executed to the binary via the auxiliary vector in O mode. This would be lost upon a re-exec. And for setuid, invoking ld.so directly breaks setuid executables (though I'm not sure this use case is something to support). And besides, saving an exec sounds cool, since exec is a very expensive process.

Anyways, here's a demo of the POC:

zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ make
gcc -shared -o absolute/lib.so lib.c
gcc -fPIC -o absolute/exe -L absolute -l:lib.so -Wl,-rpath=absolute exe.c
gcc -o tmp/strip_interp strip_interp.c
gcc -shared -o relative/lib.so lib.c
cp $(gcc --print-file-name=ld-linux-x86-64.so.2) relative/ld-linux-x86-64.so.2
cp $(gcc --print-file-name=libc.so.6) relative/libc.so.6
# gcc -o relative/exe -L relative -l:lib.so -Wl,-rpath=XORIGIN -Wl,-e_zapps_start -Wl,-Ild-linux-x86-64.so.2 tmp/zapps-crt0.o exe.c
gcc -o relative/exe -L relative -l:lib.so -Wl,-rpath=XORIGIN -Wl,-e_zapps_start tmp/zapps-crt0.o exe.c
# gcc -o relative/exe -L relative -l:lib.so -Wl,-rpath=XORIGIN -Wl,-e_zapps_start -Wl,--no-dynamic-linker tmp/zapps-crt0.o exe.c
# gcc -o relative/exe -L relative -l:lib.so -Wl,-rpath=XORIGIN exe.c
sed -i '0,/XORIGIN/{s/XORIGIN/$ORIGIN/}' relative/exe
tmp/strip_interp relative/exe

The absolute build only has a minimal rpath, just to show what the output looks like for a normal executable:

zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ absolute/exe 
static_constructor in lib invoked
static_constructor in exe invoked
main invoked with arguments:
argv[0] = absolute/exe
foo invoked
contents of /proc/self/maps:
55eb90655000-55eb90656000 r--p 00000000 00:2b 55391978                   /home/zhuyifei1999/zapps-poc/absolute/exe
55eb90656000-55eb90657000 r-xp 00001000 00:2b 55391978                   /home/zhuyifei1999/zapps-poc/absolute/exe
55eb90657000-55eb90658000 r--p 00002000 00:2b 55391978                   /home/zhuyifei1999/zapps-poc/absolute/exe
55eb90658000-55eb90659000 r--p 00002000 00:2b 55391978                   /home/zhuyifei1999/zapps-poc/absolute/exe
55eb90659000-55eb9065a000 rw-p 00003000 00:2b 55391978                   /home/zhuyifei1999/zapps-poc/absolute/exe
55eb91c90000-55eb91cb1000 rw-p 00000000 00:00 0                          [heap]
7f8b8765f000-7f8b87662000 rw-p 00000000 00:00 0 
7f8b87662000-7f8b87684000 r--p 00000000 00:20 23718226                   /lib64/libc.so.6
7f8b87684000-7f8b877d9000 r-xp 00022000 00:20 23718226                   /lib64/libc.so.6
7f8b877d9000-7f8b8782b000 r--p 00177000 00:20 23718226                   /lib64/libc.so.6
7f8b8782b000-7f8b8782f000 r--p 001c9000 00:20 23718226                   /lib64/libc.so.6
7f8b8782f000-7f8b87831000 rw-p 001cd000 00:20 23718226                   /lib64/libc.so.6
7f8b87831000-7f8b87839000 rw-p 00000000 00:00 0 
7f8b8785d000-7f8b8785e000 r--p 00000000 00:2b 55391977                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7f8b8785e000-7f8b8785f000 r-xp 00001000 00:2b 55391977                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7f8b8785f000-7f8b87860000 r--p 00002000 00:2b 55391977                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7f8b87860000-7f8b87861000 r--p 00002000 00:2b 55391977                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7f8b87861000-7f8b87862000 rw-p 00003000 00:2b 55391977                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7f8b87862000-7f8b87864000 rw-p 00000000 00:00 0 
7f8b87864000-7f8b87865000 r--p 00000000 00:20 23718239                   /lib64/ld-linux-x86-64.so.2
7f8b87865000-7f8b8788b000 r-xp 00001000 00:20 23718239                   /lib64/ld-linux-x86-64.so.2
7f8b8788b000-7f8b87895000 r--p 00027000 00:20 23718239                   /lib64/ld-linux-x86-64.so.2
7f8b87895000-7f8b87897000 r--p 00031000 00:20 23718239                   /lib64/ld-linux-x86-64.so.2
7f8b87897000-7f8b87899000 rw-p 00033000 00:20 23718239                   /lib64/ld-linux-x86-64.so.2
7fffacba6000-7fffacbc8000 rw-p 00000000 00:00 0                          [stack]
7fffacbf6000-7fffacbfa000 r--p 00000000 00:00 0                          [vvar]
7fffacbfa000-7fffacbfc000 r-xp 00000000 00:00 0                          [vdso]

It cannot be relocated:

zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ mv absolute foo
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ foo/exe 
foo/exe: error while loading shared libraries: lib.so: cannot open shared object file: No such file or directory

And this is the relocatable:

zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ relative/exe 
static_constructor in lib invoked
static_constructor in exe invoked
main invoked with arguments:
argv[0] = relative/exe
foo invoked
contents of /proc/self/maps:
555555c0e000-555555c2f000 rw-p 00000000 00:00 0                          [heap]
7f4453e1b000-7f4453e1e000 rw-p 00000000 00:00 0 
7f4453e1e000-7f4453e40000 r--p 00000000 00:2b 55356014                   /home/zhuyifei1999/zapps-poc/relative/libc.so.6
7f4453e40000-7f4453f95000 r-xp 00022000 00:2b 55356014                   /home/zhuyifei1999/zapps-poc/relative/libc.so.6
7f4453f95000-7f4453fe7000 r--p 00177000 00:2b 55356014                   /home/zhuyifei1999/zapps-poc/relative/libc.so.6
7f4453fe7000-7f4453feb000 r--p 001c9000 00:2b 55356014                   /home/zhuyifei1999/zapps-poc/relative/libc.so.6
7f4453feb000-7f4453fed000 rw-p 001cd000 00:2b 55356014                   /home/zhuyifei1999/zapps-poc/relative/libc.so.6
7f4453fed000-7f4453ff5000 rw-p 00000000 00:00 0 
7f4453ff5000-7f4453ff6000 r--p 00000000 00:2b 55391980                   /home/zhuyifei1999/zapps-poc/relative/lib.so
7f4453ff6000-7f4453ff7000 r-xp 00001000 00:2b 55391980                   /home/zhuyifei1999/zapps-poc/relative/lib.so
7f4453ff7000-7f4453ff8000 r--p 00002000 00:2b 55391980                   /home/zhuyifei1999/zapps-poc/relative/lib.so
7f4453ff8000-7f4453ff9000 r--p 00002000 00:2b 55391980                   /home/zhuyifei1999/zapps-poc/relative/lib.so
7f4453ff9000-7f4453ffa000 rw-p 00003000 00:2b 55391980                   /home/zhuyifei1999/zapps-poc/relative/lib.so
7f4453ffa000-7f4453ffc000 rw-p 00000000 00:00 0 
7f4453ffc000-7f4453ffd000 r--p 00000000 00:2b 55356013                   /home/zhuyifei1999/zapps-poc/relative/ld-linux-x86-64.so.2
7f4453ffd000-7f4454023000 r-xp 00001000 00:2b 55356013                   /home/zhuyifei1999/zapps-poc/relative/ld-linux-x86-64.so.2
7f4454023000-7f445402d000 r--p 00027000 00:2b 55356013                   /home/zhuyifei1999/zapps-poc/relative/ld-linux-x86-64.so.2
7f445402d000-7f445402f000 r--p 00031000 00:2b 55356013                   /home/zhuyifei1999/zapps-poc/relative/ld-linux-x86-64.so.2
7f445402f000-7f4454031000 rw-p 00033000 00:2b 55356013                   /home/zhuyifei1999/zapps-poc/relative/ld-linux-x86-64.so.2
7f4454031000-7f4454032000 rwxp 00000000 00:2b 55391982                   /home/zhuyifei1999/zapps-poc/relative/exe
7f4454032000-7f4454033000 r-xp 00001000 00:2b 55391982                   /home/zhuyifei1999/zapps-poc/relative/exe
7f4454033000-7f4454034000 r--p 00002000 00:2b 55391982                   /home/zhuyifei1999/zapps-poc/relative/exe
7f4454034000-7f4454035000 r--p 00002000 00:2b 55391982                   /home/zhuyifei1999/zapps-poc/relative/exe
7f4454035000-7f4454036000 rw-p 00003000 00:2b 55391982                   /home/zhuyifei1999/zapps-poc/relative/exe
7fff889ac000-7fff889ce000 rw-p 00000000 00:00 0                          [stack]
7fff889e7000-7fff889eb000 r--p 00000000 00:00 0                          [vvar]
7fff889eb000-7fff889ed000 r-xp 00000000 00:00 0                          [vdso]

... which is relocatable like other zapps:

zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ mv relative bar
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ bar/exe 
static_constructor in lib invoked
static_constructor in exe invoked
main invoked with arguments:
argv[0] = bar/exe
foo invoked
contents of /proc/self/maps:
5555565b1000-5555565d2000 rw-p 00000000 00:00 0                          [heap]
7fe1993ee000-7fe1993f1000 rw-p 00000000 00:00 0 
7fe1993f1000-7fe199413000 r--p 00000000 00:2b 55356014                   /home/zhuyifei1999/zapps-poc/bar/libc.so.6
7fe199413000-7fe199568000 r-xp 00022000 00:2b 55356014                   /home/zhuyifei1999/zapps-poc/bar/libc.so.6
7fe199568000-7fe1995ba000 r--p 00177000 00:2b 55356014                   /home/zhuyifei1999/zapps-poc/bar/libc.so.6
7fe1995ba000-7fe1995be000 r--p 001c9000 00:2b 55356014                   /home/zhuyifei1999/zapps-poc/bar/libc.so.6
7fe1995be000-7fe1995c0000 rw-p 001cd000 00:2b 55356014                   /home/zhuyifei1999/zapps-poc/bar/libc.so.6
7fe1995c0000-7fe1995c8000 rw-p 00000000 00:00 0 
7fe1995c8000-7fe1995c9000 r--p 00000000 00:2b 55391980                   /home/zhuyifei1999/zapps-poc/bar/lib.so
7fe1995c9000-7fe1995ca000 r-xp 00001000 00:2b 55391980                   /home/zhuyifei1999/zapps-poc/bar/lib.so
7fe1995ca000-7fe1995cb000 r--p 00002000 00:2b 55391980                   /home/zhuyifei1999/zapps-poc/bar/lib.so
7fe1995cb000-7fe1995cc000 r--p 00002000 00:2b 55391980                   /home/zhuyifei1999/zapps-poc/bar/lib.so
7fe1995cc000-7fe1995cd000 rw-p 00003000 00:2b 55391980                   /home/zhuyifei1999/zapps-poc/bar/lib.so
7fe1995cd000-7fe1995cf000 rw-p 00000000 00:00 0 
7fe1995cf000-7fe1995d0000 r--p 00000000 00:2b 55356013                   /home/zhuyifei1999/zapps-poc/bar/ld-linux-x86-64.so.2
7fe1995d0000-7fe1995f6000 r-xp 00001000 00:2b 55356013                   /home/zhuyifei1999/zapps-poc/bar/ld-linux-x86-64.so.2
7fe1995f6000-7fe199600000 r--p 00027000 00:2b 55356013                   /home/zhuyifei1999/zapps-poc/bar/ld-linux-x86-64.so.2
7fe199600000-7fe199602000 r--p 00031000 00:2b 55356013                   /home/zhuyifei1999/zapps-poc/bar/ld-linux-x86-64.so.2
7fe199602000-7fe199604000 rw-p 00033000 00:2b 55356013                   /home/zhuyifei1999/zapps-poc/bar/ld-linux-x86-64.so.2
7fe199604000-7fe199605000 rwxp 00000000 00:2b 55391982                   /home/zhuyifei1999/zapps-poc/bar/exe
7fe199605000-7fe199606000 r-xp 00001000 00:2b 55391982                   /home/zhuyifei1999/zapps-poc/bar/exe
7fe199606000-7fe199607000 r--p 00002000 00:2b 55391982                   /home/zhuyifei1999/zapps-poc/bar/exe
7fe199607000-7fe199608000 r--p 00002000 00:2b 55391982                   /home/zhuyifei1999/zapps-poc/bar/exe
7fe199608000-7fe199609000 rw-p 00003000 00:2b 55391982                   /home/zhuyifei1999/zapps-poc/bar/exe
7ffd3ea59000-7ffd3ea7b000 rw-p 00000000 00:00 0                          [stack]
7ffd3eab4000-7ffd3eab8000 r--p 00000000 00:00 0                          [vvar]
7ffd3eab8000-7ffd3eaba000 r-xp 00000000 00:00 0                          [vdso]

Wdyt?

zhuyifei1999 commented 1 year ago

I rewrote the logic in C so it'll hopefully be more clear https://github.com/zhuyifei1999/zapps-poc/blob/master/zapps-crt0.c

warpfork commented 1 year ago

Awesome.

warpfork commented 1 year ago

I agree the shim process part of Zapps is potentially extraneous, and a somewhat unattractive hack.

Much of what we've shipped and called Zapps so far is using a very placeholder, minimum-viable-product implementation of the shim, as well. It's big, bulky, and certainly can be improved -- even within the category of shims that are doing multiple exec's.

And this goes somewhere even further than the multiple-exec space. Super awesome. I had some idea that something like this should be possible, but it's completely spectacular to see it fully worked, and even proven by working demo.

warpfork commented 1 year ago

So I guess most questions that occur to me will be about what kinds of interfaces/ABI/behavioral-contracts this becomes sensitive to, and what that will imply about portability/maintainability/fragility.

A few questions and scattered first impressions, in no particular order:

I look forward to looking at this in greater depth soon!

zhuyifei1999 commented 1 year ago

I see in the C version, you calmly reimplement a good chunk of a minimal libc. (I completely understand why, especially compared to glibc.)

I think a lot of the functions are of two kinds - string functions and syscall functions.

Especially the parts unpacking e.g. Elf64_auxv_t. Do we need to own that? Is there a way we could avoid it?

Do you mean _zapps_getauxval_ptr? I'm not sure there's a way to guarantee the position of each auxv entry in the auxv array. As for the part where it skips argc, argv, and envp, that's required to find auxv itself. The logic is very similar to what both glibc and musl do.

And we need to patch auxv to provide the _start symbol to ld.so via the AT_ENTRY auxv entry, even if we ignore how I patched AT_BASE too.
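To make the stack walk and the auxv patch concrete, here is a minimal sketch. It is not the actual zapps-crt0.c code: the `sketch_*` names are made up for this example, and it assumes a 64-bit Linux process whose initial stack is laid out as argc, argv[], NULL, envp[], NULL, then the auxv pairs ending in AT_NULL. Which values to write into AT_ENTRY and AT_BASE is left to the caller.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not taken from zapps-crt0.c. */

/* Given the stack pointer as it is at process entry (pointing at argc),
 * return a pointer to the first auxv entry. */
Elf64_auxv_t *sketch_find_auxv(uintptr_t *sp)
{
    uintptr_t argc = sp[0];
    uintptr_t *p = sp + 1 + argc + 1;   /* skip argc, argv[], argv's NULL */

    while (*p != 0)                     /* skip envp[] */
        p++;
    return (Elf64_auxv_t *)(p + 1);     /* skip envp's NULL terminator */
}

/* Linear search, because the position of a given entry is not guaranteed.
 * Returns NULL if AT_NULL is reached first. */
Elf64_auxv_t *sketch_auxv_entry(Elf64_auxv_t *auxv, uint64_t type)
{
    for (; auxv->a_type != AT_NULL; auxv++)
        if (auxv->a_type == type)
            return auxv;
    return NULL;
}

/* Overwrite the value of an existing entry in place.
 * Returns 0 on success, -1 if the entry is absent. */
int sketch_auxv_set(Elf64_auxv_t *auxv, uint64_t type, uint64_t value)
{
    Elf64_auxv_t *e = sketch_auxv_entry(auxv, type);

    if (e == NULL)
        return -1;
    e->a_un.a_val = value;
    return 0;
}
```

With these, patching the entry point would look like `sketch_auxv_set(auxv, AT_ENTRY, (uint64_t)start_address)`, where `start_address` stands for wherever the real `_start` ended up.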

I certainly notice the .S is quite short!

It's basically the same code heh, slightly inferior. The commit message (https://github.com/zhuyifei1999/zapps-poc/commit/916205660971ee42c54a7106730a01e7e802c7dd) describes what changed in functionality in the rewrite.

What're the boundaries of portability for this? For example I see a lot of small details which are setting things up with attention to what glibc and the matching gnu ELF interp require... Is all this going to work transparently equally well if we try to deploy it rolled together with a program compiled against musl or some other libc, without modifications?

I don't see why it would cause an issue. That patch to PT_INTERP is to revert it back from PT_ZAPPS_INTERP (the swap to PT_ZAPPS_INTERP is done by strip_interp.c). My main concern about something breaking is whether a libc cares about not just the name of ld.so, but also its path, as read from the PT_INTERP segment. I think the most likely candidate for a libc that would care is glibc, but if it doesn't, I don't find it likely that other libcs would care at all.

That said, I haven't tested with musl or other libcs yet so I'm not 100% sure.

The patching away of the problematic glibc assert at runtime is neat and evidently blasts its way through a problem, but also seems kinda scary. I'm worried that changing executable pages of memory at runtime is going to trip virus detection heuristics or other security boundaries that may exist in some contexts.

In the C version I used a pwrite on /proc/self/mem instead, which seems like a much better way to do this (https://offlinemark.com/2021/05/12/an-obscure-quirk-of-proc/). No more RWX ;)

zhuyifei1999 commented 1 year ago

Wait I misread

The patching away of the problematic glibc assert at runtime

No, I didn't do this. The patch was to revert PT_INTERP back from PT_ZAPPS_INTERP. This doesn't modify any glibc code, only the segment headers of the ELF we are loading. The X bit was set because, with certain older gccs, .text starts within the first mapped page.

I use the PT_ZAPPS_INTERP patch so the kernel invokes the executable directly without an interpreter.
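As a rough illustration of that swap (not the actual strip_interp.c or crt0 code), retagging a program header in both directions could look like the following. The numeric value of PT_ZAPPS_INTERP is not given in this thread, so `SKETCH_PT_ZAPPS_INTERP` is a placeholder picked from the OS-specific range, and `sketch_retag_phdrs` is a made-up name.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder only: the real PT_ZAPPS_INTERP value is not stated in this
 * thread. Any otherwise unused value in PT_LOOS..PT_HIOS serves the sketch. */
#define SKETCH_PT_ZAPPS_INTERP (PT_LOOS + 0x5a41)

/* Hypothetical helper. Change every program header of type `from` to type
 * `to`, leaving all other fields untouched. Returns the number of headers
 * that were changed. */
size_t sketch_retag_phdrs(Elf64_Phdr *phdrs, size_t count,
                          uint32_t from, uint32_t to)
{
    size_t changed = 0;

    for (size_t i = 0; i < count; i++) {
        if (phdrs[i].p_type == from) {
            phdrs[i].p_type = to;
            changed++;
        }
    }
    return changed;
}
```

At build time the call would go from PT_INTERP to the placeholder type, so the kernel sees no interpreter; at startup the reverse call restores PT_INTERP before ld.so looks at the headers. In the running process the headers sit in a read-only mapping, so the restore has to go through some write path other than a plain store.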

zhuyifei1999 commented 1 year ago

That said, I haven't tested with musl or other libcs yet so I'm not 100% sure.

I tested musl with a cross-compiler. Interestingly in musl ld.so symlinks to libc.so:

zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ ls -l /usr/x86_64-pc-linux-musl/lib/ld-musl-x86_64.so.1
lrwxrwxrwx 1 root root 41 Dec 30 18:43 /usr/x86_64-pc-linux-musl/lib/ld-musl-x86_64.so.1 -> /usr/x86_64-pc-linux-musl/usr/lib/libc.so
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ ls -l /usr/x86_64-pc-linux-musl/usr/lib/libc.so
-rwxr-xr-x 1 root root 817656 Dec 30 18:43 /usr/x86_64-pc-linux-musl/usr/lib/libc.so

And it defines PAGE_SIZE and lacks <error.h>, so with that in mind, I did https://github.com/zhuyifei1999/zapps-poc/commit/48bb4c675ee5faeaedecbbd6b316b1b3bcd003ff

And it totally just works:

zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ make CC=x86_64-pc-linux-musl-gcc
mkdir -p absolute
x86_64-pc-linux-musl-gcc -o absolute/lib.so lib.c -fPIC -shared -g -Os -pipe
x86_64-pc-linux-musl-gcc -o absolute/exe exe.c -L absolute -l:lib.so -Wl,-rpath=absolute -g -Os -pipe
mkdir -p tmp
x86_64-pc-linux-musl-gcc -o tmp/strip_interp strip_interp.c -g -Os -pipe
x86_64-pc-linux-musl-gcc -o tmp/zapps-crt0.o zapps-crt0.c -fPIC -ffreestanding -fno-merge-constants -c -g -Os -pipe
mkdir -p relative
x86_64-pc-linux-musl-gcc -o relative/lib.so lib.c -fPIC -shared -g -Os -pipe
cp $(x86_64-pc-linux-musl-gcc --print-file-name=libc.so) relative/libc.so
x86_64-pc-linux-musl-gcc -o relative/exe exe.c -L relative -l:lib.so -Wl,-rpath=XORIGIN -Wl,-e_zapps_start -Wl,--unique=.text.zapps tmp/zapps-crt0.o -g -Os -pipe
sed -i '0,/XORIGIN/{s/XORIGIN/$ORIGIN/}' relative/exe
relative/libc.so tmp/strip_interp relative/exe
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ relative/exe
static_constructor in lib invoked
static_constructor in exe invoked
main invoked with arguments:
argv[0] = relative/exe
foo invoked
contents of /proc/self/maps:
555556007000-555556008000 ---p 00000000 00:00 0                          [heap]
555556008000-555556009000 rw-p 00000000 00:00 0                          [heap]
7f349a3a8000-7f349a3a9000 r--p 00000000 00:2b 55447740                   /home/zhuyifei1999/zapps-poc/relative/lib.so
7f349a3a9000-7f349a3aa000 r-xp 00001000 00:2b 55447740                   /home/zhuyifei1999/zapps-poc/relative/lib.so
7f349a3aa000-7f349a3ab000 r--p 00002000 00:2b 55447740                   /home/zhuyifei1999/zapps-poc/relative/lib.so
7f349a3ab000-7f349a3ac000 r--p 00002000 00:2b 55447740                   /home/zhuyifei1999/zapps-poc/relative/lib.so
7f349a3ac000-7f349a3ad000 rw-p 00003000 00:2b 55447740                   /home/zhuyifei1999/zapps-poc/relative/lib.so
7f349a3ad000-7f349a3c2000 r--p 00000000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7f349a3c2000-7f349a43e000 r-xp 00015000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7f349a43e000-7f349a474000 r--p 00091000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7f349a474000-7f349a475000 r--p 000c6000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7f349a475000-7f349a476000 rw-p 000c7000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7f349a476000-7f349a479000 rw-p 00000000 00:00 0 
7f349a479000-7f349a47a000 r--p 00000000 00:2b 55447743                   /home/zhuyifei1999/zapps-poc/relative/exe
7f349a47a000-7f349a47b000 r-xp 00001000 00:2b 55447743                   /home/zhuyifei1999/zapps-poc/relative/exe
7f349a47b000-7f349a47c000 r--p 00002000 00:2b 55447743                   /home/zhuyifei1999/zapps-poc/relative/exe
7f349a47c000-7f349a47d000 r--p 00002000 00:2b 55447743                   /home/zhuyifei1999/zapps-poc/relative/exe
7f349a47d000-7f349a47e000 rw-p 00003000 00:2b 55447743                   /home/zhuyifei1999/zapps-poc/relative/exe
7ffd719e6000-7ffd71a08000 rw-p 00000000 00:00 0                          [stack]
7ffd71b04000-7ffd71b08000 r--p 00000000 00:00 0                          [vvar]
7ffd71b08000-7ffd71b0a000 r-xp 00000000 00:00 0                          [vdso]
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ cp -r relative/ /tmp/bar
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ cd /mnt
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a /mnt $ /tmp/bar/exe
static_constructor in lib invoked
static_constructor in exe invoked
main invoked with arguments:
argv[0] = /tmp/bar/exe
foo invoked
contents of /proc/self/maps:
555556bdb000-555556bdc000 ---p 00000000 00:00 0                          [heap]
555556bdc000-555556bdd000 rw-p 00000000 00:00 0                          [heap]
7f7cce472000-7f7cce473000 r--p 00000000 00:22 31942                      /tmp/bar/lib.so
7f7cce473000-7f7cce474000 r-xp 00001000 00:22 31942                      /tmp/bar/lib.so
7f7cce474000-7f7cce475000 r--p 00002000 00:22 31942                      /tmp/bar/lib.so
7f7cce475000-7f7cce476000 r--p 00002000 00:22 31942                      /tmp/bar/lib.so
7f7cce476000-7f7cce477000 rw-p 00003000 00:22 31942                      /tmp/bar/lib.so
7f7cce477000-7f7cce48c000 r--p 00000000 00:22 31943                      /tmp/bar/libc.so
7f7cce48c000-7f7cce508000 r-xp 00015000 00:22 31943                      /tmp/bar/libc.so
7f7cce508000-7f7cce53e000 r--p 00091000 00:22 31943                      /tmp/bar/libc.so
7f7cce53e000-7f7cce53f000 r--p 000c6000 00:22 31943                      /tmp/bar/libc.so
7f7cce53f000-7f7cce540000 rw-p 000c7000 00:22 31943                      /tmp/bar/libc.so
7f7cce540000-7f7cce543000 rw-p 00000000 00:00 0 
7f7cce543000-7f7cce544000 r--p 00000000 00:22 31944                      /tmp/bar/exe
7f7cce544000-7f7cce545000 r-xp 00001000 00:22 31944                      /tmp/bar/exe
7f7cce545000-7f7cce546000 r--p 00002000 00:22 31944                      /tmp/bar/exe
7f7cce546000-7f7cce547000 r--p 00002000 00:22 31944                      /tmp/bar/exe
7f7cce547000-7f7cce548000 rw-p 00003000 00:22 31944                      /tmp/bar/exe
7ffe70e0a000-7ffe70e2c000 rw-p 00000000 00:00 0                          [stack]
7ffe70eed000-7ffe70ef1000 r--p 00000000 00:00 0                          [vvar]
7ffe70ef1000-7ffe70ef3000 r-xp 00000000 00:00 0                          [vdso]
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a /mnt $ cd /usr/tmp
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a /usr/tmp $ ln -s ../../tmp/bar/exe baz
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a /usr/tmp $ ./baz
static_constructor in lib invoked
static_constructor in exe invoked
main invoked with arguments:
argv[0] = ./baz
foo invoked
contents of /proc/self/maps:
555555f25000-555555f26000 ---p 00000000 00:00 0                          [heap]
555555f26000-555555f27000 rw-p 00000000 00:00 0                          [heap]
7f59bca77000-7f59bca78000 r--p 00000000 00:22 31942                      /tmp/bar/lib.so
7f59bca78000-7f59bca79000 r-xp 00001000 00:22 31942                      /tmp/bar/lib.so
7f59bca79000-7f59bca7a000 r--p 00002000 00:22 31942                      /tmp/bar/lib.so
7f59bca7a000-7f59bca7b000 r--p 00002000 00:22 31942                      /tmp/bar/lib.so
7f59bca7b000-7f59bca7c000 rw-p 00003000 00:22 31942                      /tmp/bar/lib.so
7f59bca7c000-7f59bca91000 r--p 00000000 00:22 31943                      /tmp/bar/libc.so
7f59bca91000-7f59bcb0d000 r-xp 00015000 00:22 31943                      /tmp/bar/libc.so
7f59bcb0d000-7f59bcb43000 r--p 00091000 00:22 31943                      /tmp/bar/libc.so
7f59bcb43000-7f59bcb44000 r--p 000c6000 00:22 31943                      /tmp/bar/libc.so
7f59bcb44000-7f59bcb45000 rw-p 000c7000 00:22 31943                      /tmp/bar/libc.so
7f59bcb45000-7f59bcb48000 rw-p 00000000 00:00 0 
7f59bcb48000-7f59bcb49000 r--p 00000000 00:22 31944                      /tmp/bar/exe
7f59bcb49000-7f59bcb4a000 r-xp 00001000 00:22 31944                      /tmp/bar/exe
7f59bcb4a000-7f59bcb4b000 r--p 00002000 00:22 31944                      /tmp/bar/exe
7f59bcb4b000-7f59bcb4c000 r--p 00002000 00:22 31944                      /tmp/bar/exe
7f59bcb4c000-7f59bcb4d000 rw-p 00003000 00:22 31944                      /tmp/bar/exe
7ffd64bb4000-7ffd64bd6000 rw-p 00000000 00:00 0                          [stack]
7ffd64bf6000-7ffd64bfa000 r--p 00000000 00:00 0                          [vvar]
7ffd64bfa000-7ffd64bfc000 r-xp 00000000 00:00 0                          [vdso]

As a control group comparison, this is invoking the ld.so directly on the normal binary:

zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ relative/libc.so absolute/exe 
static_constructor in lib invoked
static_constructor in exe invoked
main invoked with arguments:
argv[0] = absolute/exe
foo invoked
contents of /proc/self/maps:
5555558cc000-5555558cd000 ---p 00000000 00:00 0                          [heap]
5555558cd000-5555558ce000 rw-p 00000000 00:00 0                          [heap]
7f2071bc0000-7f2071bc1000 r--p 00000000 00:2b 55447734                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7f2071bc1000-7f2071bc2000 r-xp 00001000 00:2b 55447734                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7f2071bc2000-7f2071bc3000 r--p 00002000 00:2b 55447734                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7f2071bc3000-7f2071bc4000 r--p 00002000 00:2b 55447734                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7f2071bc4000-7f2071bc5000 rw-p 00003000 00:2b 55447734                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7f2071bc5000-7f2071bc6000 r--p 00000000 00:2b 55447735                   /home/zhuyifei1999/zapps-poc/absolute/exe
7f2071bc6000-7f2071bc7000 r-xp 00001000 00:2b 55447735                   /home/zhuyifei1999/zapps-poc/absolute/exe
7f2071bc7000-7f2071bc8000 r--p 00002000 00:2b 55447735                   /home/zhuyifei1999/zapps-poc/absolute/exe
7f2071bc8000-7f2071bc9000 r--p 00002000 00:2b 55447735                   /home/zhuyifei1999/zapps-poc/absolute/exe
7f2071bc9000-7f2071bca000 rw-p 00003000 00:2b 55447735                   /home/zhuyifei1999/zapps-poc/absolute/exe
7f2071bca000-7f2071bdf000 r--p 00000000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7f2071bdf000-7f2071c5b000 r-xp 00015000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7f2071c5b000-7f2071c91000 r--p 00091000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7f2071c91000-7f2071c92000 r--p 000c6000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7f2071c92000-7f2071c93000 rw-p 000c7000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7f2071c93000-7f2071c96000 rw-p 00000000 00:00 0 
7ffdc47e7000-7ffdc4809000 rw-p 00000000 00:00 0                          [stack]
7ffdc4972000-7ffdc4976000 r--p 00000000 00:00 0                          [vvar]
7ffdc4976000-7ffdc4978000 r-xp 00000000 00:00 0                          [vdso]

And this is if musl ld.so is loaded by kernel as the interpreter:

zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ patchelf --set-interpreter relative/libc.so absolute/exe
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a ~/zapps-poc $ absolute/exe
static_constructor in lib invoked
static_constructor in exe invoked
main invoked with arguments:
argv[0] = absolute/exe
foo invoked
contents of /proc/self/maps:
561685a4e000-561685a4f000 r--p 00000000 00:2b 55447735                   /home/zhuyifei1999/zapps-poc/absolute/exe
561685a4f000-561685a50000 r-xp 00001000 00:2b 55447735                   /home/zhuyifei1999/zapps-poc/absolute/exe
561685a50000-561685a51000 r--p 00002000 00:2b 55447735                   /home/zhuyifei1999/zapps-poc/absolute/exe
561685a51000-561685a52000 r--p 00002000 00:2b 55447735                   /home/zhuyifei1999/zapps-poc/absolute/exe
561685a52000-561685a53000 rw-p 00003000 00:2b 55447735                   /home/zhuyifei1999/zapps-poc/absolute/exe
561685a53000-561685a54000 rw-p 00005000 00:2b 55447735                   /home/zhuyifei1999/zapps-poc/absolute/exe
561687414000-561687415000 ---p 00000000 00:00 0                          [heap]
561687415000-561687416000 rw-p 00000000 00:00 0                          [heap]
7fcaf43cb000-7fcaf43cc000 r--p 00000000 00:2b 55447734                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7fcaf43cc000-7fcaf43cd000 r-xp 00001000 00:2b 55447734                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7fcaf43cd000-7fcaf43ce000 r--p 00002000 00:2b 55447734                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7fcaf43ce000-7fcaf43cf000 r--p 00002000 00:2b 55447734                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7fcaf43cf000-7fcaf43d0000 rw-p 00003000 00:2b 55447734                   /home/zhuyifei1999/zapps-poc/absolute/lib.so
7fcaf43d0000-7fcaf43e5000 r--p 00000000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7fcaf43e5000-7fcaf4461000 r-xp 00015000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7fcaf4461000-7fcaf4497000 r--p 00091000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7fcaf4497000-7fcaf4498000 r--p 000c6000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7fcaf4498000-7fcaf4499000 rw-p 000c7000 00:2b 55447741                   /home/zhuyifei1999/zapps-poc/relative/libc.so
7fcaf4499000-7fcaf449c000 rw-p 00000000 00:00 0 
7ffe6d728000-7ffe6d74a000 rw-p 00000000 00:00 0                          [stack]
7ffe6d754000-7ffe6d758000 r--p 00000000 00:00 0                          [vvar]
7ffe6d758000-7ffe6d75a000 r-xp 00000000 00:00 0                          [vdso]

So yeah, I think this method is pretty libc-agnostic :) other than that the file name of the ld.so needs to be changed.

zhuyifei1999 commented 1 year ago

I also took a look at musl source code to see if the path of ld.so is a concern (and musl source is much easier to read than glibc), since earlier I said

I think the most likely candidate for a libc that would care is glibc, but if it doesn't I don't find it likely that other libcs would care at all.

warpfork commented 1 year ago

Do you have any thoughts on what the next steps would be, and how we should integrate this?

I'm thinking we may want to maintain both the older boring'er way for a while, and try this in parallel. (I'm very conservative in some regards.) I don't have strong opinions about how we implement this, but if you have ideas I'm receptive.

By the way, I see you published your new code under an MIT license, and I'm happy for any open-source license like that. But just as a heads up, we tend to do Apache2-OR-MIT for other stuff, including our existing code in this repo. (This is largely guided by the licensing policies of Protocol Labs, which affects several of us by default, and is also very open-source oriented and generally pretty pleasing.) I don't think it's any problem to have code that's MIT-only floating around, but if it's okay by you to also make it Apache2-OR-MIT, it might create moderately less kerfuffle if we decide to combine these works all into one repo.

zhuyifei1999 commented 1 year ago

but if it's okay by you to also make it Apache2-OR-MIT

Sure

zhuyifei1999 commented 1 year ago

Done. https://github.com/zhuyifei1999/zapps-poc/commit/3fdfc382a053db88c72346e01cde37d97ed18ec8

Do you have any thoughts on what the next steps would be, and how we should integrate this?

I'm not familiar with how warptools does the building & packaging, so I don't think I'm the right person to ask about this. What I did was more of a proof that the extra exec need not exist, and of what procedures are necessary to get rid of it.

That said, I don't see it as any more complex than building a zapps-crt0.o in some known location and, for most projects, adding an LDFLAGS entry to use it.

warpfork commented 1 year ago

I have another dumb question that has come to my mind :)

With this shimless crt0 approach: if we wanted to add in some further preamble code to manipulate the environment variables the process sees, what options do we have? Will it be possible to do so with consistency?

We haven't done this in the ldshim repo as it stands yet either, to be clear. But it's looking like something that might slip into our scope in the future.

For context on why: tl;dr: it seems practical. (Perhaps a bit kludgey, also. But practical.) It seems that, in addition to our linker tricks to make the initial library loading work correctly and path-agnostically, there's a good number of programs out in the wild that will also require some additional kicking in the shins to behave path-agnostically at runtime. And sometimes environment variables we set at launch are going to be the easiest (read: not patching at build time) way to do this.

(We've started to see a few of these already. Also, Joey Hess (who has apparently been barking up the same tree as zapps for quite a while!) pointed out to us that in his experience working in a similar direction, there were lots of environment variables he found as the most sensible way to request fully path-agnostic behaviors at runtime. Examples he mentioned included LOCPATH, GCONV_PATH, GIT_TEMPLATE_DIR, MANPATH... but of course the individual cases aren't the point; more that it seems to come up fairly often in practice.)

(I'll also say: I still don't love environment variables as "the" way to solve these problems -- the inheritability of env vars in particular just doesn't feel correct or clean at all in these scenarios. But it's looking like something we should have a place for in our bag of tricks.)

zhuyifei1999 commented 1 year ago

With this shimless crt0 approach: if we wanted to add in some further preamble code to manipulate the environment variables the process sees, what options do we have? Will it be possible to do so with consistency?

I mean, the variables are on the stack, and the stack has a very well defined layout (https://lwn.net/Articles/631631/). You can do whatever you want with them (adding env vars, modifying them, or deleting them) before passing control over to the dynamic loader.
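To make the layout concrete, here is a rough C sketch of walking the initial stack and repointing one environment slot in place. It is not code from the POC. The names `envp_index`, `auxv_index` and `env_replace` are made up for illustration, and it leans on `<string.h>` for brevity, which a real crt0 running before the dynamic loader could not do (it would need its own string helpers). It assumes `sp` points at the `argc` word the kernel leaves at process entry.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Initial stack at process entry, lowest address first:
 *   sp[0]          argc
 *   sp[1..argc]    argv pointers
 *   sp[argc + 1]   NULL
 *   ...            envp pointers
 *   NULL
 *   ...            auxv (type, value) pairs, terminated by type 0 (AT_NULL)
 */

/* Index of the first envp slot: skip argc, the argv pointers, and argv's NULL. */
static size_t envp_index(const uintptr_t *sp)
{
    return 1 + (size_t)sp[0] + 1;
}

/* Index of the first auxv word: the slot right after envp's terminating NULL. */
static size_t auxv_index(const uintptr_t *sp)
{
    size_t i = envp_index(sp);
    while (sp[i] != 0)
        i++;
    return i + 1;
}

/*
 * Repoint the slot for `name` at `replacement`, which must be a complete
 * "NAME=value" string that stays valid for the life of the process.
 * Returns 1 if a slot was changed, 0 if the variable was not present.
 * Only the pointer is swapped, so nothing else on the stack moves.
 */
static int env_replace(uintptr_t *sp, const char *name, const char *replacement)
{
    size_t n = strlen(name);
    for (size_t i = envp_index(sp); sp[i] != 0; i++) {
        const char *entry = (const char *)sp[i];
        if (strncmp(entry, name, n) == 0 && entry[n] == '=') {
            sp[i] = (uintptr_t)replacement;
            return 1;
        }
    }
    return 0;
}
```

Replacing a value is the easy case because the slot count stays the same. Adding or deleting a variable changes the number of slots, and since the auxiliary vector is found by scanning past the first NULL after envp, the neighbouring words (and the stack pointer handed to ld.so) have to be shifted so that no gap or extra NULL is left behind. That part is not shown here.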

To be honest, I also think env variables are wrong. If bash is a zapp, and someone uses it to call a non-zapp program, the non-zapp program should not inherit bash's workarounds. Similarly, if strace is a zapp and someone passes a non-zapp for strace to invoke, that non-zapp shouldn't inherit the strace zapp's workarounds.