llvm / llvm-project

Infinite execve() loop on ppc64le with pre-loaded libasan #59114

Open mrc0mmand opened 1 year ago

mrc0mmand commented 1 year ago

Hey!

First of all, I'm not sure which side this issue is on, or whether this is a "supported" scenario at all.

In systemd, when we test everything with ASan/UBSan, we wrap certain uninstrumented binaries (that load instrumented code) in a simple shell script. This lets them work with ASan without setting LD_PRELOAD=<path-to-libasan.so> globally, which leads to all sorts of issues. The wrapper is straightforward: we move the original binary, suffix it with ".orig", and place a shell script at the original location. For example, we'd move /bin/su to /bin/su.orig and place this script at /bin/su:

#!/bin/bash
# Preload the ASan runtime DSO, otherwise ASan will complain
export LD_PRELOAD="/usr/lib64/clang/15.0.4/lib/libclang_rt.asan-x86_64.so"
# Disable LSan to speed things up, since we don't care about leak reports
# from 'external' binaries
export ASAN_OPTIONS=detect_leaks=0
# Set argv[0] to the original binary name without the ".orig" suffix
exec -a "$0" -- "/bin/su.orig" "$@"

This works pretty well and allows us to test systemd with ASan/UBSan without having to rebuild the whole system.
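The wrapping step described above can be sketched as a small helper. This is only an illustration: the function name wrap_with_asan and its two-argument interface are hypothetical and not taken from the systemd test suite; the generated script mirrors the wrapper shown above.

```shell
# Hypothetical helper (not part of systemd): replace <binary> with a wrapper
# that preloads the given ASan runtime and then executes <binary>.orig.
# Usage: wrap_with_asan <binary> <path-to-asan-runtime>
wrap_with_asan() {
    local bin="$1" asan_rt="$2"

    # Refuse to wrap twice, otherwise the wrapper itself would become ".orig"
    if [ -e "$bin.orig" ]; then
        echo "already wrapped: $bin" >&2
        return 1
    fi

    mv -- "$bin" "$bin.orig"
    # $bin and $asan_rt are expanded now; \$0 and \$@ stay literal in the wrapper
    cat >"$bin" <<EOF
#!/bin/bash
# Preload the ASan runtime DSO, otherwise ASan will complain
export LD_PRELOAD="$asan_rt"
# Disable LSan, since leak reports from 'external' binaries are not needed
export ASAN_OPTIONS=detect_leaks=0
# Set argv[0] to the original binary name without the ".orig" suffix
exec -a "\$0" -- "$bin.orig" "\$@"
EOF
    chmod +x "$bin"
}
```

For the case in this report, the call would look like `wrap_with_asan /bin/su /usr/lib64/clang/15.0.4/lib/libclang_rt.asan-powerpc64le.so`.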

However, a couple of days back, I noticed a strange failure in one of the newer tests in the ppc64le job:

[90437.173819] testsuite-58.sh[68]: ++ su testuser -s /bin/sh -c 'XDG_RUNTIME_DIR=/run/user/$UID exec "$@"' -- sh mktemp --directory /tmp/test-repart.XXXXXXXXXX
[90440.809111] testsuite-58.sh[69]: ERROR: ld.so: object '/usr/lib64/clang/14.0.0/lib/libclang_rt.asan-powerpc64le.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
[90440.809419] testsuite-58.sh[69]: /usr/bin/su: error while loading shared libraries: libpam.so.0: cannot open shared object file: Error 24

where su is the wrapper above, calling /bin/su.orig.

After several hours of playing around, it looks like preloading ASan for the su binary causes an infinite execve() loop, but only on ppc64le; on x86_64 it works as it should:

x86_64

# cat /bin/su
#!/bin/bash
# Preload the ASan runtime DSO, otherwise ASan will complain
export LD_PRELOAD="/usr/lib64/clang/15.0.4/lib/libclang_rt.asan-x86_64.so"
# Disable LSan to speed things up, since we don't care about leak reports
# from 'external' binaries
export ASAN_OPTIONS=detect_leaks=0
# Set argv[0] to the original binary name without the ".orig" suffix
exec -a "$0" -- "/bin/su.orig" "$@"
# su.orig --version
su.orig from util-linux 2.38.1

# strace -e execve,openat -- su --help
execve("/usr/bin/su", ["su", "--help"], 0x7ffeeacbc840 /* 31 vars */) = 0
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libtinfo.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
...
openat(AT_FDCWD, "/usr/bin/su", O_RDONLY) = 3
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x5616c4cc9ee0 /* 32 vars */) = 0
openat(AT_FDCWD, "/usr/lib64/clang/15.0.4/lib/libclang_rt.asan-x86_64.so", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libpam.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libpam_misc.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libaudit.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libeconf.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libcap-ng.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/proc/sys/kernel/cap_last_cap", O_RDONLY) = 3
openat(AT_FDCWD, "/proc/self/cmdline", O_RDONLY) = 4
openat(AT_FDCWD, "/proc/self/environ", O_RDONLY) = 4
...
openat(AT_FDCWD, "/usr/share/locale/en_US/LC_MESSAGES/util-linux.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en.UTF-8/LC_MESSAGES/util-linux.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en.utf8/LC_MESSAGES/util-linux.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en/LC_MESSAGES/util-linux.mo", O_RDONLY) = -1 ENOENT (No such file or directory)

Usage:
 su [options] [-] [<user> [<argument>...]]
...

ppc64le

# cat /bin/su
#!/bin/bash
# Preload the ASan runtime DSO, otherwise ASan will complain
export LD_PRELOAD="/usr/lib64/clang/15.0.4/lib/libclang_rt.asan-powerpc64le.so"
# Disable LSan to speed things up, since we don't care about leak reports
# from 'external' binaries
export ASAN_OPTIONS=detect_leaks=0
# Set argv[0] to the original binary name without the ".orig" suffix
exec -a "$0" -- "/bin/su.orig" "$@"
# su.orig --version
su.orig from util-linux 2.38.1

# strace -e execve -- su --help
execve("/usr/bin/su", ["su", "--help"], 0x7fffe2d0c430 /* 46 vars */) = 0
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x1000cc045f0 /* 47 vars */) = 0
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x7fffedbfa3e0 /* 47 vars */) = 0
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x7ffffd5546a0 /* 47 vars */) = 0
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x7fffc128c2d0 /* 47 vars */) = 0
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x7fffec341d50 /* 47 vars */) = 0
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x7fffdb41dbc0 /* 47 vars */) = 0
...
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x7fffc467b1c0 /* 47 vars */) = 0
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x7ffff1592070 /* 47 vars */) = 0
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x7fffc50d5980 /* 47 vars */) = 0
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x7fffc48844a0 /* 47 vars */) = 0
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x7fffe9ac0950 /* 47 vars */) = 0
ERROR: ld.so: object '/usr/lib64/clang/15.0.4/lib/libclang_rt.asan-powerpc64le.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
/usr/bin/su: error while loading shared libraries: libpam.so.0: cannot open shared object file: Error 24
+++ exited with 127 +++

# strace -e execve,openat -- su --help
execve("/usr/bin/su", ["su", "--help"], 0x7fffec04f040 /* 46 vars */) = 0
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libtinfo.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/dev/tty", O_RDWR|O_NONBLOCK) = 3
...
openat(AT_FDCWD, "/usr/bin/su", O_RDONLY) = 3
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x10037b345f0 /* 47 vars */) = 0
openat(AT_FDCWD, "/usr/lib64/clang/15.0.4/lib/libclang_rt.asan-powerpc64le.so", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libpam.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libpam_misc.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libaudit.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libeconf.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libcap-ng.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/proc/sys/kernel/cap_last_cap", O_RDONLY) = 3
openat(AT_FDCWD, "/proc/self/cmdline", O_RDONLY) = 4
openat(AT_FDCWD, "/proc/self/environ", O_RDONLY) = 4
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x7fffe63f9560 /* 47 vars */) = 0
openat(AT_FDCWD, "/usr/lib64/clang/15.0.4/lib/libclang_rt.asan-powerpc64le.so", O_RDONLY|O_CLOEXEC) = 4
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 4
openat(AT_FDCWD, "/lib64/libpam.so.0", O_RDONLY|O_CLOEXEC) = 4
openat(AT_FDCWD, "/lib64/libpam_misc.so.0", O_RDONLY|O_CLOEXEC) = 4
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 4
openat(AT_FDCWD, "/lib64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = 4
openat(AT_FDCWD, "/lib64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 4
openat(AT_FDCWD, "/lib64/libm.so.6", O_RDONLY|O_CLOEXEC) = 4
openat(AT_FDCWD, "/lib64/libaudit.so.1", O_RDONLY|O_CLOEXEC) = 4
openat(AT_FDCWD, "/lib64/libeconf.so.0", O_RDONLY|O_CLOEXEC) = 4
openat(AT_FDCWD, "/lib64/libcap-ng.so.0", O_RDONLY|O_CLOEXEC) = 4
openat(AT_FDCWD, "/proc/sys/kernel/cap_last_cap", O_RDONLY) = 4
openat(AT_FDCWD, "/proc/self/cmdline", O_RDONLY) = 5
openat(AT_FDCWD, "/proc/self/environ", O_RDONLY) = 5
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x7fffc45fd770 /* 47 vars */) = 0
openat(AT_FDCWD, "/usr/lib64/clang/15.0.4/lib/libclang_rt.asan-powerpc64le.so", O_RDONLY|O_CLOEXEC) = 5
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 5
openat(AT_FDCWD, "/lib64/libpam.so.0", O_RDONLY|O_CLOEXEC) = 5
openat(AT_FDCWD, "/lib64/libpam_misc.so.0", O_RDONLY|O_CLOEXEC) = 5
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 5
openat(AT_FDCWD, "/lib64/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = 5
openat(AT_FDCWD, "/lib64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 5
openat(AT_FDCWD, "/lib64/libm.so.6", O_RDONLY|O_CLOEXEC) = 5
openat(AT_FDCWD, "/lib64/libaudit.so.1", O_RDONLY|O_CLOEXEC) = 5
openat(AT_FDCWD, "/lib64/libeconf.so.0", O_RDONLY|O_CLOEXEC) = 5
openat(AT_FDCWD, "/lib64/libcap-ng.so.0", O_RDONLY|O_CLOEXEC) = 5
openat(AT_FDCWD, "/proc/sys/kernel/cap_last_cap", O_RDONLY) = 5
openat(AT_FDCWD, "/proc/self/cmdline", O_RDONLY) = 6
openat(AT_FDCWD, "/proc/self/environ", O_RDONLY) = 6
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x7ffff8d5c270 /* 47 vars */) = 0
openat(AT_FDCWD, "/usr/lib64/clang/15.0.4/lib/libclang_rt.asan-powerpc64le.so", O_RDONLY|O_CLOEXEC) = 6
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 6
...
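In the ppc64le trace above, the descriptor returned for the same library goes up by one after each repeated execve() (3, then 4, then 5, ...). The run ends with "Error 24", which is EMFILE ("Too many open files") on Linux. That pattern would be consistent with one descriptor surviving each re-exec, but this is an inference from the trace, not something confirmed here. A minimal sketch for pulling that progression out of a saved trace follows; the function name fd_progression and the log file name are hypothetical.

```shell
# Hypothetical helper: print the descriptor numbers returned by successful
# openat() calls on <path> in a saved strace log, one per line.
# Usage: fd_progression <strace-log> <path>
fd_progression() {
    awk -v needle="\"$2\"" '
        # only openat() lines that mention the quoted path and returned an fd
        index($0, "openat(") == 1 && index($0, needle) && / = [0-9]+$/ {
            print $NF
        }
    ' "$1"
}
```

Assuming the trace was saved with `strace -o su.trace -e trace=execve,openat -- su --help`, running `fd_progression su.trace /lib64/libpam.so.0` lists the descriptor used for libpam.so.0 in each iteration.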

However, if I drop the LD_PRELOAD=... line from the su script, everything goes back to normal:

# cat /bin/su
#!/bin/bash
# Disable LSan to speed things up, since we don't care about leak reports
# from 'external' binaries
export ASAN_OPTIONS=detect_leaks=0
# Set argv[0] to the original binary name without the ".orig" suffix
exec -a "$0" -- "/bin/su.orig" "$@"

# strace  -e execve,openat -- su --help
execve("/usr/bin/su", ["su", "--help"], 0x7fffee2d44b0 /* 46 vars */) = 0
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libtinfo.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/dev/tty", O_RDWR|O_NONBLOCK) = 3
...
openat(AT_FDCWD, "/usr/bin/su", O_RDONLY) = 3
execve("/bin/su.orig", ["/usr/bin/su", "--help"], 0x100127e45f0 /* 46 vars */) = 0
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libpam.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libpam_misc.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libaudit.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libeconf.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libcap-ng.so.0", O_RDONLY|O_CLOEXEC) = 3
...
openat(AT_FDCWD, "/usr/share/locale/en.utf8/LC_MESSAGES/util-linux.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en/LC_MESSAGES/util-linux.mo", O_RDONLY) = -1 ENOENT (No such file or directory)

Usage:
 su [options] [-] [<user> [<argument>...]]
...

Other binaries we wrap this way seem to be working fine-ish (i.e. no "infinite" loop), but the execve() calls are doubled:

ls (with wrapper, but without LD_PRELOAD=):

# strace -e execve -- ls -l ~
execve("/usr/bin/ls", ["ls", "-l", "/root"], 0x7fffcc26a798 /* 46 vars */) = 0
execve("/bin/ls.orig", ["/usr/bin/ls", "-l", "/root"], 0x1000ad545f0 /* 46 vars */) = 0
total 256
-rw-------. 1 root root 18565 Nov 21 12:18 anaconda-ks.cfg
-rw-r--r--. 1 root root     6 Nov 21 12:18 NETBOOT_METHOD.TXT
-rw-------. 1 root root 18582 Nov 21 12:23 original-ks.cfg
-rw-r--r--. 1 root root     9 Nov 21 12:18 RECIPE.TXT
+++ exited with 0 +++

ls (with wrapper and LD_PRELOAD=):

# strace -e execve -- ls -l ~
execve("/usr/bin/ls", ["ls", "-l", "/root"], 0x7fffdcb4f378 /* 46 vars */) = 0
execve("/bin/ls.orig", ["/usr/bin/ls", "-l", "/root"], 0x100092845f0 /* 47 vars */) = 0
execve("/bin/ls.orig", ["/usr/bin/ls", "-l", "/root"], 0x7fffead45308 /* 47 vars */) = 0
total 256
-rw-------. 1 root root 18565 Nov 21 12:18 anaconda-ks.cfg
-rw-r--r--. 1 root root     6 Nov 21 12:18 NETBOOT_METHOD.TXT
-rw-------. 1 root root 18582 Nov 21 12:23 original-ks.cfg
-rw-r--r--. 1 root root     9 Nov 21 12:18 RECIPE.TXT
+++ exited with 0 +++

Same with stat:

# Without LD_PRELOAD=
# strace -e execve -- stat ~
execve("/usr/bin/stat", ["stat", "/root"], 0x7fffdb495aa0 /* 46 vars */) = 0
execve("/bin/stat.orig", ["/usr/bin/stat", "/root"], 0x1003f7045f0 /* 46 vars */) = 0
# With LD_PRELOAD=
# strace -e execve -- stat ~
execve("/usr/bin/stat", ["stat", "/root"], 0x7ffff9d33f70 /* 46 vars */) = 0
execve("/bin/stat.orig", ["/usr/bin/stat", "/root"], 0x10004a145f0 /* 47 vars */) = 0
execve("/bin/stat.orig", ["/usr/bin/stat", "/root"], 0x7fffcd4e52b0 /* 47 vars */) = 0

And so on. I'm not sure if that's expected, but at least these binaries seem to be working fine.
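To check many wrapped binaries for repeated execs without reading each trace by hand, the successful execve() calls in a saved trace can be counted per path. This is a rough sketch; the function name summarize_execs is hypothetical, and it assumes the log has the same line format as the strace output shown in this report.

```shell
# Hypothetical helper: count successful execve() calls per executable path
# in a saved strace log and print "<count> <path>" lines sorted by path.
# Usage: summarize_execs <strace-log>
summarize_execs() {
    # With '"' as the field separator, $2 is the first quoted string on the
    # line, which for execve() is the path being executed.
    awk -F'"' '
        /^execve\(/ && / = 0$/ { n[$2]++ }
        END { for (p in n) print n[p], p }
    ' "$1" | sort -k2
}
```

A count above 1 for a ".orig" path points at a re-exec like the ones seen above for ls and stat with LD_PRELOAD set.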

This was originally spotted on CentOS 8 Stream with compiler-rt-14.0.0-3.module_el8.7.0+1149+a59781f0.ppc64le, but it's reproducible on Fedora Rawhide with compiler-rt-15.0.4-1.fc38.ppc64le as well.

llvmbot commented 1 year ago

@llvm/issue-subscribers-backend-powerpc