exquo / signal-libs-build

Automatic compilation of native libraries for the Signal messenger
GNU General Public License v3.0

Support older glibc versions #21

Closed (pbiering closed this issue 3 months ago)

pbiering commented 3 months ago

This updated requirement drops support for EL8, which itself does not reach EOSL until 2029-05. Was this intended?

ldd libsignal_jni.so
./libsignal_jni.so: /lib64/libm.so.6: version `GLIBC_2.29' not found (required by ./libsignal_jni.so)
    linux-vdso.so.1 (0x00007ffff3990000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007ffac97e6000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ffac95c6000)
    libm.so.6 => /lib64/libm.so.6 (0x00007ffac9244000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007ffac9040000)
    libc.so.6 => /lib64/libc.so.6 (0x00007ffac8c6a000)
    /lib64/ld-linux-x86-64.so.2 (0x00007ffaca3ec000)

Last "known-good": 0.52.0

Can this be circumvented so that GLIBC 2.28 support is kept for the next few years?

This is related to https://github.com/AsamK/signal-cli/issues/1560

pbiering commented 3 months ago

Fixed now: I received a link to additional native builds: https://github.com/AsamK/signal-cli/issues/1560#issuecomment-2255066149

exquo commented 2 months ago

I have used a RHEL8-based image to build the x86_64 target. Here is the updated binary in the v0.52.2 release; it requires glibc v2.28.

Cross-compiling for ARM is less straightforward on RHEL than on Debian, as the glibc package for ARM architectures is not maintained in RHEL/Fedora's repos. Instead, I tried using the GCC toolchain downloaded directly from the ARM website. The armv7 version compiles (log), but aarch64 seems to get misidentified by CMake as a 32-bit architecture (log).

Another tool useful for targeting older versions of glibc is zig cc, a drop-in replacement for gcc / clang. It can cross-compile for different architectures out of the box, and lets you specify the desired glibc version explicitly, for instance:

zig cc foo.c -o foo -target x86_64-linux-gnu.2.28

A build for the x86_64-linux-gnu.2.17 target successfully produces the .so artifact. But I'm not sure whether it will actually run on distros with glibc 2.17: its dynamic symbols table (objdump -T) still references functions statx and getrandom (as weak symbols), which were introduced in glibc versions 2.28 and 2.25, respectively.
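The split between weak and strong glibc references can be checked mechanically. As a minimal sketch (the function names `glibc_symbol_report` and `max_strong_glibc` are made up for illustration, and the parsing assumes the `objdump -T` column layout quoted later in this thread), the following reads `objdump -T` output on stdin and reports each versioned undefined symbol together with its binding:

```shell
# Reads `objdump -T` output on stdin.
# Prints "<weak|strong> <GLIBC version> <symbol>" for every undefined, glibc-versioned symbol.
glibc_symbol_report() {
  awk '/\*UND\*/ && /GLIBC_[0-9.]+/ {
    # In objdump -T output, a weak symbol has "w" as its second whitespace-separated field.
    kind = ($2 == "w") ? "weak" : "strong"
    match($0, /GLIBC_[0-9.]+/)
    print kind, substr($0, RSTART, RLENGTH), $NF
  }'
}

# Highest glibc version among the strong (mandatory) references only.
# Relies on GNU sort -V for version ordering.
max_strong_glibc() {
  glibc_symbol_report | awk '$1 == "strong" { print $2 }' | sort -V | tail -n 1
}
```

Typical use would be `objdump -T libsignal_jni.so | max_strong_glibc`. If that prints a version no newer than the target's glibc and the only newer references are weak, the library at least has a chance of loading there; whether it then behaves correctly is a separate question.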

Cross-compiling using zig produced the same results as ARM's toolchain: armv7 builds successfully (logs), but aarch64 fails, apparently because it tries to build for a 32-bit system (logs).

For now I will leave cross-compilation to Debian 11's cross toolchains (requiring glibc v2.29), and for building the x86_64 target use a RHEL8-based image (with glibc v2.28).
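Since the two kinds of builds end up with different minimum glibc versions, a host-side check can tell which one a given system can use. A minimal sketch, assuming GNU `sort -V` and a glibc-based host where `getconf GNU_LIBC_VERSION` prints something like `glibc 2.28`; the function names `version_ge` and `host_glibc` are illustrative only:

```shell
# version_ge A B: succeeds if version A is greater than or equal to version B.
version_ge() {
  # After a version sort, the smaller of the two comes first;
  # A >= B holds exactly when that smaller one is B.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Prints the host's glibc version number only, e.g. "2.28".
host_glibc() {
  getconf GNU_LIBC_VERSION | awk '{ print $2 }'
}
```

For example, `version_ge "$(host_glibc)" 2.28` would succeed on EL8 for the RHEL8-based x86_64 build, while the same check against 2.29 would fail there.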

m-ueberall commented 2 months ago

> A build for the x86_64-linux-gnu.2.17 target successfully produces the .so artifact. But I'm not sure whether it will actually run on distros with glibc 2.17: its dynamic symbols table (objdump -T) still references functions statx and getrandom (as weak symbols), which were introduced in glibc versions 2.28 and 2.25, respectively.

Thanks a lot for the above observation, which I had totally missed. Until earlier this week, I thought that by simply cross-compiling against older glibc versions, all required function calls would automatically be mapped or the build would fail. This is apparently not the case: existing build-time tests for newer glibc functions have not reliably produced errors in the past, and cross-compilation does not check whether weak symbols can be satisfied in the target architecture/environment.

As you mentioned, it looks like it might be necessary to actually provide implementations for at least those two functions (not sure about the remaining undefined symbols below) or to check whether those functions are never called by signal-cli (which sounds even more challenging/impractical and which could also easily change later on):

# #### Ubuntu 13.04/Raring uses glibc 2.17; using older glibc versions should be pointless as GraalVM binaries refer to GLIBC_2.17 symbols:
# #### On the other hand, glibc 2.17 would (only) be needed for EL7 compatibility (but then, EL7 already reached its EOL on June 30th, 2024)
# grep GLIBC libsignal_jni_so0522_ubuntu1304_amd64.objdump | sed -e 's|.*(||g' -e 's|).*||' | sort -V | tail -1
GLIBC_2.16
# cat signal-cli0135_ubuntu1304_amd64.objdump | grep GLIBC|sed -e 's|.* ||g' | sort -V | tail -1
GLIBC_2.17

# grep 'UND' libsignal_jni_so0522_ubuntu2004_amd64.objdump |grep -vE 'GLIBC_|GCC_' >libsignal_jni_so0522_ubuntu2004_amd64.objdump.undefined
# grep 'UND' libsignal_jni_so0522_ubuntu1304_amd64.objdump |grep -vE 'GLIBC_|GCC_' >libsignal_jni_so0522_ubuntu1304_amd64.objdump.undefined
# diff libsignal_jni_so0522_ubuntu1304_amd64.objdump.undefined libsignal_jni_so0522_ubuntu2004_amd64.objdump.undefined | grep '^<'
< 0000000000000000  w   D  *UND*        0000000000000000  Base        statx
< 0000000000000000  w   D  *UND*        0000000000000000  Base        __cxa_thread_atexit_impl
< 0000000000000000  w   D  *UND*        0000000000000000  Base        getrandom
# grep -E 'statx|getrandom|__cxa_thread_atexit_impl' libsignal_jni_so0522_ubuntu2004_amd64.objdump          
0000000000000000  w   DF *UND*  0000000000000000 (GLIBC_2.28) statx                         <-- potential problem for Ubuntu 18.04/Bionic, 16.04/Xenial, 14.04/Trusty
0000000000000000  w   DF *UND*  0000000000000000 (GLIBC_2.18) __cxa_thread_atexit_impl
0000000000000000  w   DF *UND*  0000000000000000 (GLIBC_2.25) getrandom                     <-- potential problem for Ubuntu 16.04/Xenial, 14.04/Trusty
# cat libsignal_jni_so0522_ubuntu2004_amd64.objdump.undefined                               <-- not checked against older environments/Ubuntu versions yet
0000000000000000  w   D  *UND*  0000000000000000  Base        _ITM_deregisterTMCloneTable
0000000000000000  w   D  *UND*  0000000000000000  Base        OPENSSL_memory_get_size
0000000000000000  w   D  *UND*  0000000000000000  Base        OPENSSL_memory_alloc
0000000000000000  w   D  *UND*  0000000000000000  Base        __gmon_start__
0000000000000000  w   D  *UND*  0000000000000000  Base        sdallocx
0000000000000000  w   D  *UND*  0000000000000000  Base        OPENSSL_memory_free
0000000000000000  w   D  *UND*  0000000000000000  Base        _ITM_registerTMCloneTable
# 

With respect to the "not reliably produced errors" mentioned above, I ran into the following earlier this week (which suggests that at least some of the hundreds of dependencies/libraries used to build libsignal_jni.so don't thoroughly check whether certain functions are actually available and/or simply rely on "being built for modern environments"):

####FIXME#### gcc-13[.3.0] required for sparc64 (only!?)
####FIXME#### QUESTION: Why does the below error only show up exactly /once/ for 22 gcc+architecture build combinations, not 2 (gcc-1[34]+sparc64) or 11 (gcc-14) times?
####FIXME#### Using gcc-14[.2.0] for cross-build architectures amd64,arm64,armv7,i686,mips64el,mipsel,powerpc,ppc64,ppc64el,s390x (against Ubuntu 16.04 glibc) doesn't show the below problem:
####FIXME#### ./compile_native_signal-cli.sh "--no-refresh" "--debug" "--lib-only" "--lib-version" "0.54.2" "--lib-archs" "ubuntu1604:sparc64"
    Run Build Command(s): /usr/bin/cmake -E env VERBOSE=1 /usr/bin/gmake -f Makefile cmTC_d64dc/fast
    /usr/bin/gmake  -f CMakeFiles/cmTC_d64dc.dir/build.make CMakeFiles/cmTC_d64dc.dir/build
    gmake[1]: Entering directory '/tmp/libsignal_ubuntu2004-x86_64/target/sparc64-unknown-linux-gnu/release/build/boring-sys-8959fe11258ee08c/out/build/CMakeFiles/CMakeScratch/TryCompile-pg9ZLC'
    Building CXX object CMakeFiles/cmTC_d64dc.dir/testCXXCompiler.cxx.o
    /usr/bin/sparc64-linux-gnu-g++   -ffunction-sections -fdata-sections -fPIC  -o CMakeFiles/cmTC_d64dc.dir/testCXXCompiler.cxx.o -c /tmp/libsignal_ubuntu2004-x86_64/target/sparc64-unknown-linux-gnu/release/build/boring-sys-8959fe11258ee08c/out/build/CMakeFiles/CMakeScratch/TryCompile-pg9ZLC/testCXXCompiler.cxx
    /usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_d64dc.dir/link.txt --verbose=1
    /usr/bin/sparc64-linux-gnu-g++  -ffunction-sections -fdata-sections -fPIC  CMakeFiles/cmTC_d64dc.dir/testCXXCompiler.cxx.o -o cmTC_d64dc
    /usr/lib/gcc-cross/sparc64-linux-gnu/14/../../../../sparc64-linux-gnu/bin/ld: /usr/lib/gcc-cross/sparc64-linux-gnu/14/libstdc++.so: undefined reference to `getentropy@GLIBC_2.25'
    collect2: error: ld returned 1 exit status
    gmake[1]: *** [CMakeFiles/cmTC_d64dc.dir/build.make:99: cmTC_d64dc] Error 1

exquo commented 2 months ago

@m-ueberall The linker should indeed abort if a referenced symbol is not defined in one of the modules.[^*] The fact that a symbol is marked as "weak" indicates that it may not be mandatory. Looking specifically at the three weak symbols here:

Based on the (circumstantial) evidence above, it might be reasonable to expect the glibc-2.17 build to run okay on systems with glibc 2.17.

To test it, it should be possible to write a short program that calls the relevant functions from the shared library. It would also help to compile the .so with the debugging info preserved. Tracing the functions' usage in signal-cli with a call graph / control flow diagram would probably be too complicated to check in all cases. N.B.: I'm no expert in this whole area.
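Before writing such a test program, a cheaper first step is to check whether the dynamic loader can resolve everything the library needs on the target system. A minimal sketch, assuming glibc's `ldd` (whose `-r` option also processes function relocations); the function name `ldd_output_is_clean` is illustrative:

```shell
# Reads `ldd -r` output on stdin.
# Succeeds only if no missing library, missing symbol version,
# or undefined symbol is reported.
ldd_output_is_clean() {
  ! grep -Eq 'not found|undefined symbol'
}
```

On the target system this could be used as `ldd -r ./libsignal_jni.so 2>&1 | ldd_output_is_clean && echo "resolves cleanly"`. It only covers load-time resolution: it says nothing about whether the code paths behind the weak symbols behave correctly at run time.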

[^*]: The last code block in your comment (compiling for sparc64) shows such a linker error. Why it does not happen for other architectures, I can only guess. Since libsignal_jni.so does not reference glibc's getentropy function, and this error happens when CMake tries to compile testCXXCompiler.cxx, this might also be a quirk of CMake.

m-ueberall commented 2 months ago

> Based on the (circumstantial) evidence above, it might be reasonable to expect the glibc-2.17 build to run okay on systems with glibc 2.17.

@exquo: Yes, after some more extensive digging I came to the same conclusion; assuming I haven't missed some heavily obfuscated calls that escaped the numerous local regex searches (not that there should be any to begin with!), libsignal's/the underlying libraries' exception handling should shield the user from problems, even on systems using very old Linux kernels.

While it still puzzles me that libsignal_jni.so includes the statx, getrandom symbols when the underlying libraries only actually use matching (architecture-specific) syscalls (using varying "magic constants"), looking at the complete sources for the respective targets (Ubuntu 18.04/16.04/13.04 C[++] headers/libraries, Rust crate sources, BoringSSL sources), it can be seen that they all include a respective fallback for the three symbols/routine references in question:

Given the findings above, it should be safe to remove the warning about undefined symbols from the signal-cli wiki topic and to keep the cross-building procedure for older versions of Ubuntu as-is – I'll just reference this post in the change comment.

m-ueberall commented 2 months ago

> The last code block in your comment (compiling for sparc64) shows such a linker error. Why it does not happen for other architectures, I can only guess. Since libsignal_jni.so does not reference glibc's getentropy function, and this error happens when CMake tries to compile testCXXCompiler.cxx, this might also be a quirk of CMake.

@exquo: Lest I forget – this is the next TODO on the local list; earlier this week, the local build script gained the necessary basic functionality to also deal with "unwanted/superfluous tests" that address functionality not really used by the application/artefact to be built (in this case, libsignal_jni.so) by intercepting, logging (and modifying) all calls to respective compilers/linkers. (While this was originally done to address another generic problem and there may be ways to simply tell CMake to skip certain tests, it might come in handy and save some time with respect to changes that couldn't be easily sent upstream anyway.)
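The general idea of intercepting compiler/linker calls can be sketched as follows. This is not the local build script mentioned above, only a minimal illustration of the technique; `log_and_run`, `WRAP_LOG` and `DROP_FLAG` are hypothetical names:

```shell
# Hypothetical wrapper: logs a tool invocation, optionally drops one flag, then runs it.
# WRAP_LOG  - file that receives one line per invocation (illustrative default below)
# DROP_FLAG - if set, any argument exactly equal to it is removed before running
log_and_run() {
  # Record the full original command line.
  printf '%s\n' "$*" >> "${WRAP_LOG:-/tmp/toolchain-calls.log}"
  # Rebuild the argument list, skipping arguments equal to DROP_FLAG.
  for _arg in "$@"; do
    shift
    if [ -n "${DROP_FLAG:-}" ] && [ "$_arg" = "$DROP_FLAG" ]; then
      continue
    fi
    set -- "$@" "$_arg"
  done
  # Run the (possibly modified) command.
  "$@"
}
```

A wrapper script placed ahead of the real compiler in `PATH` could call something like `log_and_run /usr/bin/sparc64-linux-gnu-g++ "$@"`, which leaves the build system itself unchanged.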

exquo commented 2 months ago

About building on old (unsupported) distros: it would be good to know what specific security issues might be associated with that. For instance, even if there is a known bug in the old glibc version on the build system, it should not be a problem if the resulting .so is used on a system with an updated glibc. Many pieces of the software used in the build (rust, protobuf) use versions newer than those in the distro's repos anyway. And the packages that (typically) do come from the repos (gcc, cmake, git, …) are unlikely to have the kind of bugs that could be relevant to the security of the build. I'm sure there might be other concerns that I'm not thinking about right now. But at first blush, it's not evident what problems arise from building on an unsupported distro version.

m-ueberall commented 1 month ago

@exquo: Sorry, I simply forgot to reply to this earlier.

You're right that in this case,

makes it rather unlikely that security issues might arise. However, even in this scenario, there might be cases where components have been compromised and/or contain errors that are only identified later on; my remark was meant to address the general case.

Two arbitrary examples:

Again, in case of libsignal_jni.so, which is really tiny, the above examples might not apply (at least not for well-known/-used architectures like amd64, arm64 and suitable distros that offer reliable development/release cycles), but they could; especially when (cross-)building for a larger number of less common architectures, chances are that you run into potential risks due to lack of maintenance unless you actively monitor architecture-dependent changes upstream (again, arguably not possible) or choose build environments based on distros/releases that are still under active maintenance (to ensure that someone (else) at least tries to keep up with potential security issues).