exquo / signal-libs-build

Automatic compilation of native libraries for the Signal messenger
GNU General Public License v3.0

Build configuration for Linux/AMD64/musl, i.e. OpenWRT, Alpine Linux #19

Closed kevemueller closed 7 months ago

kevemueller commented 8 months ago

I would like to use a plugin which pulls in libsignal_jni.so distributed here. Unfortunately the dynamic linking fails as your build is for AMD64/glibc (the most common variant), whereas my platform OpenWRT uses AMD64/musl. A more prominent example would be Alpine Linux.

Would it be possible to add AMD64/musl to the list of targets you provide builds for? The rust target triple is: x86_64-unknown-linux-musl

exquo commented 8 months ago

I'll look into it!

For reference, here are some discussions of linking libsignal statically / with musl:

mentioning workarounds for running glibc-linked programs on Alpine:

kevemueller commented 8 months ago

Hi exquo, thanks for picking this up. The pointers for running glibc on Alpine are known, but are not really applicable (as mentioned I am running OpenWRT) and I am using a Java plugin that depends on libsignal_jni.so. Containers, flatpak, etc. are out of reach on the hardware.

The JavaVM is OpenJDK built against musl (Azul packaged) and works like a charm. I strongly doubt that a compatibility stub, like alpine-pkg-glibc would work in this setup (if it existed on OpenWRT).

A statically linked library is definitely the way to go. By dropping in the build artefact posted by @morph027 I could make my plugin work. So this may be closed in favour of a statically linked libsignal_jni.so, which carries a 500k penalty (8% increase) compared with the dynamically linked one. The dynamically linked build remains a nice-to-have.

exquo commented 7 months ago

@kevemueller Could you please try this musl build artifact? (Produced in this workflow run)

exquo commented 7 months ago

I have added the x86_64-unknown-linux-musl target to the latest release, libsignal v0.45.0.

Note that the resulting .so file is a dynamically-linked library, so the musl libc must be present on your system (as it is on Alpine, OpenWRT, etc).

The full listing of the dependent libraries:

$ ldd libsignal_jni.so
/lib/ld-musl-x86_64.so.1 (0x7f841cbf3000)
libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x7f841cbcf000)
libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f841cbf3000)

The listing contains libgcc_s.so.1 (not to be confused with glibc). If your system does not provide it by default, you can install it; for instance, on Alpine it is in the libgcc package.

I have also tried building a statically-linked object, but the boring-ssl step fails (log). Likewise for cross-compilation, e.g. aarch64 (log).

So the musl builds here are dynamically linked, and only target the x86_64 architecture.

Also, there exist signal-cli packages for Alpine (x86_64 and aarch64).

kevemueller commented 7 months ago

Hi @exquo, thanks for the excellent work. I can confirm that this build works on OpenWRT/amd64. The ldd output is slightly different and indicates a mismatch, yet the runtime behaviour is ok:

# ldd libsignal_jni.so
        ldd (0x7f6ad2a13000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x7f6ad23dd000)
        libc.musl-x86_64.so.1 => ldd (0x7f6ad2a13000)

Thank you again for fixing this!

m-ueberall commented 7 months ago

@exquo @morph027: I tried to quickly reproduce the above on amd64/arm64 hosts, and while the first attempt failed, the attached notes (if the resulting artefacts work as expected) might be of interest to you with respect to compatibility: CHECKME_musl_crt-static.log

It turns out that by simply using a rust *-unknown-linux-musl target/platform along with -Ctarget-feature=-crt-static and the "default" gcc/g++ compiler, you end up with dynamic libraries that still require glibc, but don't contain calls to newer functions. So instead of v2.29, you'd only need v2.25, released on 2017-02-05 – which should work on e.g. Ubuntu Bionic; since the GraalVM binaries themselves and the embedded dynamic sqlite library combined only require glibc v2.17(amd64)/v2.9(arm64), rebuilding them based on those "alternative" libsignal_jni.so libraries would also allow them to work with older Linux distributions without further manual modification. (I haven't tried whether you can use gcc as a linker instead of clang yet.)

morph027 commented 7 months ago

Uh, that's good to read, will try this later. This makes the build toolchain a bit more flexible, as there is no need to stick to old distribution releases for building.

m-ueberall commented 7 months ago

For the record: Make that "stick to much older distribution releases"–I did a test a couple of weeks ago, and it showed that if you use Ubuntu Jammy (22.04) instead of Ubuntu Focal (20.04) to build libsignal_jni.so, you more or less "automatically" end up with newer glibc function calls. Not sure whether the same applies in the above setting, though–maybe I'll try that during the next days.

morph027 commented 7 months ago

Ah, okay. Right now, I'm building with the docker image rust:buster, which has glibc 2.28.

exquo commented 7 months ago

@m-ueberall Thanks for sharing the results! Good to keep the glibc dependency version (reasonably) low.

However, it is strange that a build for target *-unknown-linux-musl links against any glibc at all. For the purpose of the current ticket, that would be a bug, as the goal is to have a binary free of glibc references. But if due to some compilation alchemy it produces a binary that works on older glibc-based distros, that's handy too.

exquo commented 7 months ago

@kevemueller 👍 The ldd outputs differ because Alpine and OpenWRT have the required libraries in different locations. In the output the important entries are to the left of =>, i.e. libgcc_s.so.1 and libc.musl-x86_64.so.1: these are the shared libraries used by the resulting libsignal_jni.so object.
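
To compare two systems by library name rather than by path, the entries to the left of => can be extracted from the ldd output. The following is a minimal sketch; needed_libs is a hypothetical helper name, and it assumes the ldd output format shown in this thread.

```sh
#!/bin/sh
# Sketch: print the shared-library names a binary needs, i.e. the
# entries to the left of "=>" in ldd output read from stdin.
# needed_libs is a hypothetical helper name, not part of any tool.
needed_libs() {
    # Lines without "=>" (the dynamic loader itself) are skipped.
    awk '$2 == "=>" { print $1 }'
}

# Example (assumes ldd is available on the target system):
#   ldd libsignal_jni.so | needed_libs
```

Run against the Alpine and OpenWRT listings above, this should print the same two names on both systems even though the resolved paths differ.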

morph027 commented 7 months ago

Resulting artifact did not work for me...

2024-04-27T09:21:30.851Z [main] WARN  org.asamk.signal.manager.Manager - Failed to call libsignal-client: Can't load library: /tmp/10337173321067881312libsignal_jni.so
Missing required native library dependency: libsignal-client

exquo commented 7 months ago

@morph027 I'm using rust:buster image too (for glibc-based builds; the musl one is built on Alpine). buster should reach EOL by this July. Building on bullseye bumps the required glibc version to 2.29.
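
One way to see which glibc version a given build requires is to look for the highest GLIBC_x.y symbol version it references. The sketch below assumes the dynamic symbol table is dumped with objdump -T from binutils and that grep supports -o (GNU and BusyBox grep do); max_glibc_version is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch: print the highest GLIBC_x.y[.z] version tag found in the text
# on stdin, e.g. the output of `objdump -T`.
# max_glibc_version is a hypothetical helper name.
max_glibc_version() {
    grep -o 'GLIBC_[0-9][0-9.]*' \
        | sed 's/^GLIBC_//' \
        | sort -t. -k1,1n -k2,2n \
        | tail -n 1
}

# Example (assumes binutils is installed):
#   objdump -T libsignal_jni.so | max_glibc_version
```

For a library that links only against musl, no GLIBC_ tags are expected and the function prints nothing.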

m-ueberall commented 7 months ago

Yes, unfortunately, when I tested this locally tonight, it also wouldn't work (same error message), regardless of whether you use clang or gcc as the linker. If I had to guess, I'd say that the resulting .so is at least missing references to external libraries like libpthread. Funny that I don't see any error messages during the respective build processes (library, GraalVM binary), but then, maybe I just cannot identify them. I wonder whether it's worth reporting this upstream (Rust) … but first, I'd like to be able to actually create a .so based on musl in order to see "exactly" what I'm missing in the local environment at the moment. (I'd also like to find out why libsignal v0.45 builds for sparc64 when v0.45.1 does not, despite the fact that the non-working crate should not have been touched.)

m-ueberall commented 7 months ago

Ok, after a few hours of modifying the local (glibc-based) build environment and reading through a large number of issues, I nearly managed to reproduce the current musl build of version 0.45.1 in this repository which is built on GitHub using an Alpine container with musl-dev v1.2.4_git20230717-r4:

# musl-ldd ./exquo_signal-libs-build__libsignal_jni.so  
        musl-ldd (0x7f784757e000)
        libgcc_s.so.1 => ./libgcc_s.so.1 (0x7f7846bb0000)
        libc.musl-x86_64.so.1 => musl-ldd (0x7f784757e000)
# musl-ldd ./libsignal_jni_so0451_ubuntu2004_amd64-musl 
        musl-ldd (0x7f4827c5d000)
        libgcc_s.so.1 => ./libgcc_s.so.1 (0x7f4827280000)
        libc.so => musl-ldd (0x7f4827c5d000)
Error relocating ./libsignal_jni_so0451_ubuntu2004_amd64-musl: __memcpy_chk: symbol not found
Error relocating ./libsignal_jni_so0451_ubuntu2004_amd64-musl: __memset_chk: symbol not found
#

The remaining errors are due to the fact that you really need v1.2.4 (or newer) of a complete musl toolchain (which needs to be built from scratch locally using, e.g., https://github.com/richfelker/musl-cross-make with respect to the dynamic libraries). gcc nowadays uses _FORTIFY_SOURCE on Debian/Ubuntu and musl simply does not contain *_chk counterparts (see https://github.com/briansmith/ring/issues/409).
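
To check a freshly built library for this problem before trying to load it, the undefined *_chk symbols can be listed. This is a minimal sketch that assumes the symbols are dumped with nm -D --undefined-only from binutils; fortify_symbols is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch: print the undefined fortify helpers (__memcpy_chk, __memset_chk, ...)
# found in `nm -D --undefined-only` output read from stdin.
# fortify_symbols is a hypothetical helper name.
fortify_symbols() {
    # The symbol name is the last field; strip an optional @VERSION suffix.
    awk '$NF ~ /^__[A-Za-z0-9_]+_chk(@.*)?$/ { sub(/@.*/, "", $NF); print $NF }' \
        | sort -u
}

# Example (assumes binutils is installed):
#   nm -D --undefined-only ./libsignal_jni.so | fortify_symbols
```

An empty result suggests the library was built without references to the *_chk functions that musl lacks.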

In summary, if you're cross-compiling on a glibc based host, you're currently facing three problems (two of which @exquo managed to dodge with the Alpine build container):

  1. You need to find the correct setup (simply specifying the $(arch)-unknown-linux-musl target does not work as seen in my previous comment) based on, e.g., the pointers found here: https://github.com/rust-lang/wg-cargo-std-aware/issues/76#issuecomment-977996458
    • I'm currently using the following (you also need a musl-gcc.specs.modified which actually does the heavy lifting and is passed to the "standard" $(arch)-linux-gnu-gcc[-NN] compiler by the $(arch)-linux-musl-gcc.modified shell script; both modifications are directly derived from the original files contained in musl-dev.deb by replacing the paths of your musl toolchain, see the notes below):
      CC="x86_64-linux-musl-gcc.modified"; CXX="x86_64-linux-gnu-g++"
      CARGO_LINKER="${CC}"
      CARGO_RUSTFLAGS='["-Ctarget-feature=-crt-static"]'
  2. The rust toolchain "somehow" introduces a dependency on libgcc_s.so which is said to be unneeded and unwanted
    • see https://github.com/rust-lang/rust/issues/82521#issuecomment-786093169
    • Due to the above, installing the musl-dev package on Debian/Ubuntu is not sufficient, because the above dynamic library is missing; simply picking up an existing toolchain archive from http://musl.cc/ (all of which are rather old) alongside {musl,musl-dev,musl-tools}.deb, regardless of whether you use v1.2.4 or v1.2.5 of the latter, also does not work (this combination is what produced the above artefact we inspected using musl-ldd)
  3. Unless you match the configuration inside the Alpine build container (containing a complete musl v1.2.4 toolchain or newer which does not make use of _FORTIFY_SOURCE), you're facing the above missing symbols problem
    • see https://github.com/sgerrand/alpine-pkg-glibc/issues/176 (Edit: this link might look a bit misleading as it's referring to glibc v2.34 vs. v2.35, which is not present on Ubuntu Focal to begin with; it indirectly shows that libraries compiled using _FORTIFY_SOURCE pose a problem)
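
The CARGO_LINKER and CARGO_RUSTFLAGS names in item 1 look like variables of a local build script rather than settings Cargo reads directly; that reading is an assumption. Cargo itself accepts per-target overrides through environment variables of the form CARGO_TARGET_<TRIPLE>_LINKER and CARGO_TARGET_<TRIPLE>_RUSTFLAGS, with the triple upper-cased and dashes replaced by underscores. A minimal sketch of deriving those names (cargo_target_env is a hypothetical helper):

```sh
#!/bin/sh
# Sketch: build the name of Cargo's per-target environment variable.
# cargo_target_env is a hypothetical helper name.
#   $1: target triple, e.g. x86_64-unknown-linux-musl
#   $2: config key, e.g. linker or rustflags
cargo_target_env() {
    printf 'CARGO_TARGET_%s_%s\n' "$1" "$2" | tr 'a-z-' 'A-Z_'
}

# Example (the linker name is the modified wrapper mentioned in item 1):
#   export "$(cargo_target_env x86_64-unknown-linux-musl linker)=x86_64-linux-musl-gcc.modified"
#   export "$(cargo_target_env x86_64-unknown-linux-musl rustflags)=-Ctarget-feature=-crt-static"
#   cargo build --release --target x86_64-unknown-linux-musl
```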

m-ueberall commented 2 months ago

Short update: Working on the "musl TODO" again after a couple of weeks, I came across this comment, which suggested linking against gcc_eh instead of gcc_s by using a wrapper (CARGO_LINKER=".../any-linux-musl-gcc.wrapper") similar to the one shown below:

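#!/bin/bash
#### see https://github.com/rust-lang/rust/issues/29527#issuecomment-931874664
# Rebuild the argument list, dropping every -lgcc_s passed by rustc.
for arg do
    shift
    case "${arg}" in
        -lgcc_s) ;;                    # skip the dynamic libgcc
        *) set -- "$@" "${arg}" ;;     # keep everything else, in order
    esac
done
# Call the real compiler driver (taken from ${CC}) and link libgcc_eh statically instead.
exec "${CC}" "$@" -lgcc_eh -lc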

The result looks promising, although the size of the resulting libsignal_jni.so is rather large when compared with the "native" Alpine build:

# ls -l libsignal_jni*
-rwxr-xr-x 1 sys-maint adm  13454512 Sep 26 22:40 libsignal_jni.so-v0.58.1-x86_64-unknown-linux-musl
-rw-rw---- 1 sys-maint adm    567868 Sep 27 12:18 libsignal_jni.so-v0.58.1-x86_64-unknown-linux-musl.log
-rwxrwxr-x 1 sys-maint adm 184670992 Sep 27 12:41 libsignal_jni_so0581_ubuntu2004_amd64-musl
-rwxrwxr-x 1 sys-maint adm  26236760 Sep 27 12:43 libsignal_jni_so0581_ubuntu2004_amd64-musl.stripped
# file libsignal_jni.so-v0.58.1-x86_64-unknown-linux-musl libsignal_jni_so0581_ubuntu2004_amd64-musl*
libsignal_jni.so-v0.58.1-x86_64-unknown-linux-musl:  ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=5f3f571124f1b53328f96ddc76962d454e83810f, not stripped
libsignal_jni_so0581_ubuntu2004_amd64-musl:          ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, with debug_info, not stripped
libsignal_jni_so0581_ubuntu2004_amd64-musl.stripped: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, stripped
# musl-ldd ./libsignal_jni_so0581_ubuntu2004_amd64-musl
        musl-ldd (0x7f88d6c51000)
        libc.so => musl-ldd (0x7f88d6c51000)
# 

Interestingly, while the .so built on Alpine has a proper ELF header which can also be parsed by the default ldd on glibc-based distros like Ubuntu, one sees a number of missing symbols (e.g., __cpu_indicator_init); since one usually wouldn't use musl-based libraries with glibc-based environments, this shouldn't be that important, though.

Now, inside an Alpine 3.20 container (NB in the following, we're looking at v0.58.0 which is needed by the current signal-cli release 0.13.7):

alpine-320-amd64:~# echo $(grep -E '^NAME=|^VERSION_ID=' /etc/os-release)
NAME="Alpine Linux" VERSION_ID=3.20.3
alpine-320-amd64:~# ls -l libsignal_jni*
-rwxr-xr-x    1 root     root      13378664 Sep 18 22:40 libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl
-rwxrwxr-x    1 root     root     185273400 Sep 30 11:21 libsignal_jni_so0580_ubuntu2004_amd64-musl
-rwxrwxr-x    1 root     root      26154872 Sep 30 11:22 libsignal_jni_so0580_ubuntu2004_amd64-musl.stripped
alpine-320-amd64:~# ldd libsignal_jni_so0580_ubuntu2004_amd64-musl.stripped
        /lib/ld-musl-x86_64.so.1 (0x7f5850aa6000)
        libc.so => /lib/ld-musl-x86_64.so.1 (0x7f5850aa6000)
alpine-320-amd64:~# ldd libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl
        /lib/ld-musl-x86_64.so.1 (0x7f00ddfe9000)
Error loading shared library libgcc_s.so.1: No such file or directory (needed by libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl)
        libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f00ddfe9000)
Error relocating libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl: _Unwind_Resume: symbol not found
Error relocating libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl: _Unwind_Backtrace: symbol not found
Error relocating libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl: _Unwind_GetIPInfo: symbol not found
Error relocating libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl: _Unwind_RaiseException: symbol not found
Error relocating libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl: _Unwind_SetGR: symbol not found
Error relocating libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl: _Unwind_GetDataRelBase: symbol not found
Error relocating libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl: _Unwind_FindEnclosingFunction: symbol not found
Error relocating libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl: _Unwind_GetIP: symbol not found
Error relocating libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl: _Unwind_GetLanguageSpecificData: symbol not found
Error relocating libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl: _Unwind_GetTextRelBase: symbol not found
Error relocating libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl: _Unwind_DeleteException: symbol not found
Error relocating libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl: _Unwind_GetRegionStart: symbol not found
Error relocating libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl: _Unwind_SetIP: symbol not found
Error relocating libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl: _Unwind_GetCFA: symbol not found
alpine-320-amd64:~# apk add libgcc
(1/1) Installing libgcc (13.2.1_git20240309-r0)
OK: 201 MiB in 40 packages
alpine-320-amd64:~# ldd libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl
        /lib/ld-musl-x86_64.so.1 (0x7f06553b8000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x7f0654945000)
        libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f06553b8000)
alpine-320-amd64:~# apk del libgcc
(1/1) Purging libgcc (13.2.1_git20240309-r0)
OK: 201 MiB in 39 packages
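
The missing libraries can also be pulled out of such a listing automatically, for example in a start-up check. The sketch below assumes the "Error loading shared library" message format printed by the musl loader in the transcript above; missing_libs is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch: print the names of libraries the musl loader could not find,
# based on its "Error loading shared library NAME: ..." messages on stdin.
# missing_libs is a hypothetical helper name.
missing_libs() {
    sed -n 's/^Error loading shared library \([^:]*\):.*/\1/p' | sort -u
}

# Example (the redirection covers the case where errors go to stderr):
#   ldd libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl 2>&1 | missing_libs
```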

The simplest test run that accesses the embedded dynamic library looks like it's working (below, both libsignal-client-0.58.0.jar.exquo and libsignal-client-0.58.0.jar.projektzentrisch only contain their respective renamed libsignal_jni_amd64.so):

alpine-320-amd64:~# echo $(grep -E '^NAME=|^VERSION_ID=' /etc/os-release)
NAME="Alpine Linux" VERSION_ID=3.20.3
alpine-320-amd64:~# ls -l libsignal_jni*
-rwxr-xr-x    1 root     root      13378664 Sep 18 22:40 libsignal_jni.so-v0.58.0-x86_64-unknown-linux-musl
-rwxrwxr-x    1 root     root     185273400 Sep 30 11:21 libsignal_jni_so0580_ubuntu2004_amd64-musl
-rwxrwxr-x    1 root     root      26154872 Sep 30 11:22 libsignal_jni_so0580_ubuntu2004_amd64-musl.stripped
alpine-320-amd64:~# ls -l /opt/signal-cli-0.13.7/lib/libsignal-client*
lrwxrwxrwx    1 root     root            44 Sep 30 10:30 /opt/signal-cli-0.13.7/lib/libsignal-client-0.58.0.jar -> libsignal-client-0.58.0.jar.exquo
-rw-r--r--    1 root     root       5579159 Sep 18 22:40 /opt/signal-cli-0.13.7/lib/libsignal-client-0.58.0.jar.exquo
-rw-r--r--    1 root     root       8735760 Sep 30 11:22 /opt/signal-cli-0.13.7/lib/libsignal-client-0.58.0.jar.projektzentrisch
-rw-r--r--    1 root     root      42730661 Jan  2  1970 /opt/signal-cli-0.13.7/lib/libsignal-client-0.58.0.jar.upstream
alpine-320-amd64:~# signal-cli daemon --dbus
WARN  Manager - Failed to call libsignal-client: /tmp/libsignal5938144787961118960/libsignal_jni_amd64.so: Error loading shared library libgcc_s.so.1: No such file or directory (needed by /tmp/libsignal593>
Missing required native library dependency: libsignal-client
alpine-320-amd64:~# apk add libgcc
(1/1) Installing libgcc (13.2.1_git20240309-r0)
OK: 201 MiB in 40 packages
alpine-320-amd64:~# signal-cli daemon --dbus
INFO  DaemonCommand - Starting daemon in multi-account mode
[...]
alpine-320-amd64:~#
alpine-320-amd64:~# apk del libgcc
(1/1) Purging libgcc (13.2.1_git20240309-r0)
OK: 201 MiB in 39 packages
alpine-320-amd64:~# ls -l /opt/signal-cli-0.13.7/lib/libsignal-client*
lrwxrwxrwx    1 root     root            44 Sep 30 11:30 /opt/signal-cli-0.13.7/lib/libsignal-client-0.58.0.jar -> libsignal-client-0.58.0.jar.projektzentrisch
-rw-r--r--    1 root     root       5579159 Sep 18 22:40 /opt/signal-cli-0.13.7/lib/libsignal-client-0.58.0.jar.exquo
-rw-r--r--    1 root     root       8735760 Sep 30 11:22 /opt/signal-cli-0.13.7/lib/libsignal-client-0.58.0.jar.projektzentrisch
-rw-r--r--    1 root     root      42730661 Jan  2  1970 /opt/signal-cli-0.13.7/lib/libsignal-client-0.58.0.jar.upstream
alpine-320-amd64:~# signal-cli daemon --dbus
INFO  DaemonCommand - Starting daemon in multi-account mode
[...]
alpine-320-amd64:~#
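
Switching between the jar variants in the test above amounts to re-pointing one symlink. A minimal sketch, assuming the variant files sit next to the link and carry the suffixes used in the listing (.exquo, .projektzentrisch, .upstream); select_libsignal_jar is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch: point the libsignal-client jar symlink at one of the variants.
# select_libsignal_jar is a hypothetical helper name.
#   $1: lib directory, e.g. /opt/signal-cli-0.13.7/lib
#   $2: jar file name, e.g. libsignal-client-0.58.0.jar
#   $3: variant suffix, e.g. exquo
select_libsignal_jar() {
    if [ ! -f "$1/$2.$3" ]; then
        echo "no such variant: $1/$2.$3" >&2
        return 1
    fi
    # A relative link target keeps the directory relocatable.
    ln -sf "$2.$3" "$1/$2"
}

# Example:
#   select_libsignal_jar /opt/signal-cli-0.13.7/lib libsignal-client-0.58.0.jar projektzentrisch
```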

According to earlier remarks, the use of a Rust-native unwinder (https://github.com/nbdd0121/unwinding) and musl-clang instead of musl-gcc might be beneficial with respect to both size and the generation of stackdumps, but this will require additional work …