JonathonReinhart / staticx

Create static executable from dynamic executable
https://staticx.readthedocs.io/

Building an Aarch64 executable with staticx on CentOS creates a crashing bootloader #213

Open Jongy opened 2 years ago

Jongy commented 2 years ago

Taken from my comment in https://github.com/JonathonReinhart/staticx/issues/181#issuecomment-915616425:

I can confirm that staticx works just fine when running on an Aarch64 machine. As you mentioned, there's no wheel for Aarch64, so it gets built locally. However, when running the staticx build in a Dockerfile cross-built on x86_64, the resulting binary crashes with SIGSEGV. It ran just fine when I tried it under gdb. The stack trace from gdb, loaded with a core file generated in a "clean" run, is:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000ffffa4909848 in __GI___libc_malloc (bytes=472) at malloc.c:3056
3056    malloc.c: No such file or directory.
(gdb) bt
#0  0x0000ffffa4909848 in __GI___libc_malloc (bytes=472) at malloc.c:3056
#1  0x0000ffffa48f4058 in __fopen_internal (filename=0xffffa4a093c0 "/etc/passwd", mode=0xffffa4a09240 "rce", is32=1) at iofopen.c:58
#2  0x0000ffffa4a05ff8 in internal_setent (stream=0xfffff4062180) at nss_files/files-XXX.c:77
#3  0x0000ffffa4a06204 in _nss_files_getpwnam_r (name=0xc144111 "root", result=0x4c2a08 <resbuf>, buffer=0xc144220 "", buflen=1024, errnop=0xc13b720) at nss_files/files-pwd.c:32
#4  0x00000000004362b4 in getpwnam_r ()
#5  0x0000000000435f5c in getpwnam ()
#6  0x0000000000404d0c in th_get_uid (t=0xc143ff0) at libtar/decode.c:49
#7  0x0000000000403c0c in th_print_long_ls (t=0xc143ff0, f=0x4c0040 <_IO_2_1_stderr_>) at libtar/output.c:51
#8  0x00000000004039b0 in tar_extract_all (t=0xc143ff0, prefix=0xc13cb30 "/tmp/staticx-7eHn81") at libtar/extract.c:552
#9  0x0000000000401ac0 in extract_archive (dest_path=0xc13cb30 "/tmp/staticx-7eHn81") at bootloader/extract.c:229
#10 0x0000000000402950 in main (argc=1, argv=0xfffff40635f8) at bootloader/main.c:423

Suspecting some type of heap corruption, I also tried valgrind, but the binary runs successfully there as well, haha. I also tried building the bootloader with -fsanitize=address, but my GCC was too old for that.

My suspicion is that something doesn't get emulated correctly and the bootloader is built with semi-correct settings (some parameters still fitting x86_64 and not Aarch64), which results in the corruption.

I might find some time to continue investigating this over the weekend. Meanwhile, if you have any suggestions on where to continue searching, I'd be happy to hear them :)

For reference: I based my tests on the image centos@sha256:43964203bf5d7fe38c6fca6166ac89e4c095e2b0c0a28f6c7c678a1348ddc7fa (installing everything needed for staticx, plus the latest staticx from PyPI).

Jongy commented 2 years ago

And from my comment in https://github.com/JonathonReinhart/staticx/issues/181#issuecomment-922392184:

Using this Dockerfile on a Linux xxx 5.8.0-1035-aws #37~20.04.1-Ubuntu SMP Tue Jun 1 09:52:32 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux machine creates a crashing binary:

# centos:7
FROM centos@sha256:43964203bf5d7fe38c6fca6166ac89e4c095e2b0c0a28f6c7c678a1348ddc7fa AS build-stage

RUN cat /etc/os-release;

RUN yum install -y gcc python3 curl python3-pip python3-devel

RUN python3 -m pip install wheel scons
RUN yum install -y glibc-static
RUN python3 -m pip install staticx

RUN yum install -y bzip2
RUN curl -o /tmp/patchelf-0.13.tar.bz2  -sSL https://github.com/NixOS/patchelf/releases/download/0.13/patchelf-0.13.tar.bz2
RUN yum install -y make
RUN yum install -y gcc-c++
RUN cd /tmp && tar -jxf patchelf-0.13.tar.bz2 && cd patchelf* && ./configure  --disable-dependency-tracking && make install

RUN echo 'void main() { printf("hello world!\n"); return 0; }' > a.c && gcc a.c -o a
RUN staticx a /a_static

FROM scratch AS export-stage

COPY --from=build-stage /a_static /a_static

while running all commands locally on that machine results in a working binary.

JonathonReinhart commented 2 years ago

Hi @Jongy, thanks for the detailed bug report. Sorry it took so long to get back to you.

I suspect this issue is resolved by #228, which was just released in v0.13.8. Your backtrace (thanks again!) is very similar to #227.

Please try it out and let me know!

Jongy commented 2 years ago

Thanks @JonathonReinhart ! Once I get to test it, I'll update here.

Jongy commented 1 year ago

Hi @JonathonReinhart, it took me a while to get back to this, but it seems that another problem exists now. These tests were performed on staticx 0.13.8.

Running this Dockerfile on an Aarch64 machine works and produces a binary that runs on the Aarch64 system.

Running the same Dockerfile on an x86_64 system with emulation crashes during staticx:

 => ERROR [build-stage 13/14] RUN staticx a /a_static    4.9s
------
 > [build-stage 13/14] RUN staticx a /a_static:
#16 4.890 staticx: Unexpected ldd error (1):
#16 4.890 
#16 4.890 qemu-aarch64-static: ./include/qemu/rcu.h:102: rcu_read_unlock: Assertion `p_rcu_reader->depth != 0' failed.
#16 4.890 /bin/ldd: line 115:    66 Segmentation fault      (core dumped) LD_TRACE_LOADED_OBJECTS=1 LD_WARN= LD_BIND_NOW= LD_VERBOSE= "$@"
------

Another staticx bug?

By the way, I looked back at what I did in https://github.com/Granulate/gprofiler/pull/355, which was my attempt at solving the problem described in this ticket (the backtrace) and in https://github.com/JonathonReinhart/staticx/issues/227. I also encountered the getpwnam problem there and solved it pretty much like you did :sweat_smile: I just removed all calls to it.
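
For anyone landing here later, this is a minimal sketch of the "no getpwnam" idea, not the actual staticx or gprofiler change. In the backtrace above, th_get_uid reaches getpwnam, which ends up in nss_files code even though the bootloader is statically linked. One way to avoid that path is to take the owner from the numeric uid field of the tar header, which is stored as ASCII octal, and never look up the user name at all. The names parse_octal_field and uid_from_header are hypothetical helpers made up for this example, and the 8-byte field width is assumed from the ustar header layout.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not staticx or libtar code): decode a numeric tar
 * header field. Such fields hold ASCII octal digits, optionally preceded by
 * spaces and terminated by a space or NUL.
 */
unsigned long parse_octal_field(const char *field, size_t len)
{
    assert(field != NULL);

    size_t i = 0;
    unsigned long value = 0;

    /* Skip leading space padding. */
    while (i < len && field[i] == ' ')
        i++;

    /* Accumulate octal digits; stop at the first non-octal byte. */
    for (; i < len && field[i] >= '0' && field[i] <= '7'; i++)
        value = value * 8 + (unsigned long)(field[i] - '0');

    return value;
}

/*
 * Hypothetical replacement for a name-based lookup: use the numeric uid
 * recorded in the header, so no getpwnam() call (and no NSS) is involved.
 */
unsigned long uid_from_header(const char uid_field[8])
{
    return parse_octal_field(uid_field, 8);
}
```

With this approach, a header whose uid field contains "0001750" yields uid 1000, regardless of which users exist on the machine doing the extraction. The trade-off is that user names stored in the archive are ignored, which seems acceptable for a bootloader that only unpacks its own payload.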