clangen / musikcube

a cross-platform, terminal-based music player, audio engine, metadata indexer, and server in c++
https://musikcube.com
BSD 3-Clause "New" or "Revised" License

SIGSEGV (Address boundary error) Gentoo linking libtinfo instead of libtinfow #610

Open Arniiiii opened 1 year ago

Arniiiii commented 1 year ago

How it looks: I wrote an ebuild for Gentoo and emerged (read: configured, compiled, and installed) musikcube successfully.

Then I ran musikcube and got this:

# musikcube 
fish: Job 1, 'musikcube' terminated by signal SIGSEGV (Address boundary error)

In gdb, built with debug compilation flags (-O3 -pipe -march=znver2 -g):

Reading symbols from /usr/share/musikcube/musikcube...
Reading symbols from /usr/lib/debug//usr/share/musikcube/musikcube.debug...
(gdb) r
Starting program: /usr/share/musikcube/musikcube 
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff6f436c0 (LWP 21959)]
[New Thread 0x7fffe9f316c0 (LWP 21960)]
[New Thread 0x7fffe97306c0 (LWP 21961)]
[New Thread 0x7fffe8f2f6c0 (LWP 21962)]
[New Thread 0x7fffe3fff6c0 (LWP 21963)]
[New Thread 0x7fffe37fe6c0 (LWP 21964)]
[New Thread 0x7fffdbfff6c0 (LWP 21965)]
[Thread 0x7fffdbfff6c0 (LWP 21965) exited]
[Thread 0x7fffe3fff6c0 (LWP 21963) exited]

Thread 1 "musikcube" received signal SIGSEGV, Segmentation fault.
0x00007ffff7be330f in termattrs_sp () from /lib64/libncursesw.so.6

I don't know what the problem is.

clangen commented 1 year ago

Which terminal emulator are you using? Does it crash if you start using TERM=xterm-256color musikcube?

Arniiiii commented 1 year ago

> Which terminal emulator are you using? Does it crash if you start using TERM=xterm-256color musikcube?

I tried in Plasma on X11 and on Wayland, in kitty, in Konsole, and in a raw tty, both with and without this env var. Same result, and I don't know why. If you know of anything I can test to get a clue, reply and I'll try it.

Arniiiii commented 1 year ago

maybe this can help somehow: image

Arniiiii commented 1 year ago

After some further debugging, I found that the problem happens somewhere around here, at line 289 (musikcube 3.0.1): image

Arniiiii commented 1 year ago

After learning some more about debugging, I tried building ncurses with debug info and trace info. After running strace -f -o ./trace.log /usr/share/musikcube/musikcube I got this log file:

trace.log

clangen commented 1 year ago

Is it possible you have two different ncurses libraries on your machine, are linking against one, but at runtime a different one is getting loaded? For example, I want to say on some version of Debian, at some point, there were different versions of ncurses in /lib versus /usr/lib or something -- I remember dealing with a problem similar to this in the past.

Does the pre-compiled version supplied in the release page work? This version includes a copy of the libncurses it links against to avoid this sort of issue. https://github.com/clangen/musikcube/releases/download/3.0.1/musikcube_linux_x86_64_3.0.1.tar.bz2

Arniiiii commented 1 year ago
~ [SIGSEGV]> lddtree /usr/share/musikcube/musikcube
/usr/share/musikcube/musikcube (interpreter => /lib64/ld-linux-x86-64.so.2)
    libcurl.so.4 => /usr/lib64/libcurl.so.4
        libcares.so.2 => /usr/lib64/libcares.so.2
        libnghttp2.so.14 => /usr/lib64/libnghttp2.so.14
        libssl.so.1.1 => /usr/lib64/libssl.so.1.1
        libbrotlidec.so.1 => /usr/lib64/libbrotlidec.so.1
            libbrotlicommon.so.1 => /usr/lib64/libbrotlicommon.so.1
        libz.so.1 => /lib64/libz.so.1
    libcrypto.so.1.1 => /usr/lib64/libcrypto.so.1.1
    libncursesw.so.6 => /lib64/libncursesw.so.6
        libtinfow.so.6 => /lib64/libtinfow.so.6
    libpanelw.so.6 => /usr/lib64/libpanelw.so.6
    libtinfo.so.6 => /lib64/libtinfo.so.6
    libmusikcore.so => /usr/share/musikcube/libmusikcore.so
    libstdc++.so.6 => /usr/lib/gcc/x86_64-pc-linux-gnu/12/libstdc++.so.6
    libm.so.6 => /lib64/libm.so.6
    libgcc_s.so.1 => /usr/lib/gcc/x86_64-pc-linux-gnu/12/libgcc_s.so.1
    libc.so.6 => /lib64/libc.so.6

> Is it possible you have two different ncurses libraries on your machine, are linking against one, but at runtime a different one is getting loaded? For example, I want to say on some version of Debian, at some point, there were different versions of ncurses in /lib versus /usr/lib or something -- I remember dealing with a problem similar to this in the past.
>
> Does the pre-compiled version supplied in the release page work? This version includes a copy of the libncurses it links against to avoid this sort of issue. https://github.com/clangen/musikcube/releases/download/3.0.1/musikcube_linux_x86_64_3.0.1.tar.bz2

Arniiiii commented 1 year ago

Maybe it's about libtinfo vs. libtinfow. I'm digging into what's on the internet about ncurses problems on Gentoo.

Arniiiii commented 1 year ago

after full debug compilation of ncurses: image image

Arniiiii commented 1 year ago

https://bugs.gentoo.org/692954

this works: LD_PRELOAD='/lib64/libtinfow.so.6' /usr/share/musikcube/musikcube

clangen commented 1 year ago

Oh, wow, interesting find! It looks like this is part of a larger problem, given it affects other apps. However, after reading through the comments it's not clear to me if this affects all Gentoo users, or just ones who have compiled/configured ncurses a particular way... do you have clarity around that?

If this is a general problem we can probably figure out how to detect Gentoo during the build process and ensure we link against the right library, or modify the start script to include the LD_PRELOAD flag.

Arniiiii commented 1 year ago

> Oh, wow, interesting find! It looks like this is part of a larger problem, given it affects other apps. However, after reading through the comments it's not clear to me if this affects all Gentoo users, or just ones who have compiled/configured ncurses a particular way... do you have clarity around that?
>
> If this is a general problem we can probably figure out how to detect Gentoo during the build process and ensure we link against the right library, or modify the start script to include the LD_PRELOAD flag.

I'm rebuilding all packages that depend on ncurses. Tomorrow I'll check whether this helped. If not, I may be able to make musikcube start with the flag for Gentoo users at install time.

Arniiiii commented 1 year ago

OK, can you explain where in the code the shell script at /usr/bin/musikcube comes from? Is it CMake stuff? Or do other distros not have such a thing?

clangen commented 1 year ago

cmake generates this file when processing musikcube/CMakeLists.txt using musikcube.in, found here:

So you can define a new variable in cmake land, like set(musikcube_LD_PRELOAD "LD_PRELOAD=foo"), and then tweak musikcube.in to look something like this:

#!/bin/sh

set -eu

cd "@musikcube_INSTALL_DIR@"/share/musikcube/
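# musikcube_LD_PRELOAD holds a full "LD_PRELOAD=..." assignment (or is empty);
# env applies it, because exec cannot take a leading VAR=value word.
exec env @musikcube_LD_PRELOAD@ ./musikcube "$@"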

Note: if you get this working, we can follow the same steps for musikcubed.
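
A launcher could also apply the preload only when the library is actually present, so the same script stays harmless on systems without a separate libtinfow. This is a minimal sketch under assumptions: the helper name preload_assignment is hypothetical and not part of musikcube, and the library path is whatever the packager passes in.

```shell
#!/bin/sh
# preload_assignment (hypothetical helper): print an LD_PRELOAD=... assignment
# if the given library file exists, and print nothing otherwise.
preload_assignment() {
    if [ -e "$1" ]; then
        printf 'LD_PRELOAD=%s\n' "$1"
    fi
}

# Example launcher line, commented out so this file has no side effects.
# The substitution is left unquoted on purpose so that an empty result
# disappears; this assumes the library path contains no spaces.
# exec env $(preload_assignment /lib64/libtinfow.so.6) ./musikcube "$@"
```

When the file is missing, the substitution expands to nothing and env simply runs the program unchanged.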

Arniiiii commented 1 year ago

I made a patch and an ebuild for all this. My patch just adds LD_PRELOAD="libtinfow.so.6", and I assume there's no ncurses5 on Gentoo, so I guess it will work.

For someone on gentoo: https://github.com/Gerodote/ex_repo/tree/master

You said that I need to patch something for musikcubed, but I don't know what it even does. I can run it and it looks like it works, but I don't know how to test it properly. If you could explain, I can patch it.

zhuyifei1999 commented 1 year ago

(Reddit sent me here) I'm testing the ebuild.

The problem happens here: https://github.com/clangen/musikcube/blob/b4035271319588624fb4425b4b2e317f1ec723cd/src/musikcube/CMakeLists.txt#L116-L118

And we see in the configure logs:

-- [ncurses] using library names with 'w' prefix
-- [ncurses] not Darwin! will attempt to link against libtinfo
-- [musikcube] using libtinfo at: /usr/lib64/libtinfo.so
-- [musikcube] using libncurses at: /usr/lib64/libncursesw.so
-- [musikcube] using libpanel at: /usr/lib64/libpanelw.so

And you end up linking against both libtinfo and libtinfow:

/usr/share/musikcube $ ldd ./musikcube
    linux-vdso.so.1 (0x00007ffd256f0000)
    libcurl.so.4 => /usr/lib64/libcurl.so.4 (0x00007f6f45f83000)
    libcrypto.so.3 => /usr/lib64/libcrypto.so.3 (0x00007f6f45600000)
    libncursesw.so.6 => /usr/lib64/libncursesw.so.6 (0x00007f6f45f3d000)
    libpanelw.so.6 => /usr/lib64/libpanelw.so.6 (0x00007f6f45f37000)
    libtinfo.so.6 => /usr/lib64/libtinfo.so.6 (0x00007f6f45eee000)
    libmusikcore.so => /usr/share/musikcube/./libmusikcore.so (0x00007f6f45000000)
    libstdc++.so.6 => /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6 (0x00007f6f44c00000)
    libm.so.6 => /usr/lib64/libm.so.6 (0x00007f6f45b23000)
    libgcc_s.so.1 => /usr/lib/gcc/x86_64-pc-linux-gnu/13/libgcc_s.so.1 (0x00007f6f45ec7000)
    libc.so.6 => /usr/lib64/libc.so.6 (0x00007f6f44a27000)
    libcares.so.2 => /usr/lib64/libcares.so.2 (0x00007f6f45ead000)
    libnghttp2.so.14 => /usr/lib64/libnghttp2.so.14 (0x00007f6f45af2000)
    libssl.so.3 => /usr/lib64/libssl.so.3 (0x00007f6f45548000)
    libz.so.1 => /usr/lib64/libz.so.1 (0x00007f6f45ad1000)
    libtinfow.so.6 => /usr/lib64/libtinfow.so.6 (0x00007f6f454fe000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f6f4605f000)

libtinfo is pulled in because -ltinfo is in the link flags, and libtinfow is pulled in because libncursesw depends on it:

$ ldd /usr/lib64/libncursesw.so.6
    linux-vdso.so.1 (0x00007ffc067f6000)
    libc.so.6 => /usr/lib64/libc.so.6 (0x00007fdde6ab1000)
    libtinfow.so.6 => /usr/lib64/libtinfow.so.6 (0x00007fdde6a67000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fdde6cf9000)
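
The double-linking shown above can also be checked mechanically. A minimal sketch, assuming ldd-style output is piped in on stdin; the helper name has_mixed_tinfo is hypothetical and not part of musikcube:

```shell
#!/bin/sh
# has_mixed_tinfo (hypothetical helper): read ldd-style output on stdin and
# succeed (exit status 0) only if both libtinfo.so* and libtinfow.so* appear.
has_mixed_tinfo() {
    output=$(cat)
    # 'libtinfo\.so' cannot match libtinfow.so, because a 'w' sits where
    # the pattern requires a literal dot.
    printf '%s\n' "$output" | grep -q 'libtinfo\.so' &&
        printf '%s\n' "$output" | grep -q 'libtinfow\.so'
}
```

Usage would look like: ldd /usr/share/musikcube/musikcube | has_mixed_tinfo && echo "both terminfo libraries are loaded"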

So I tested this patch:

diff --git a/src/musikcube/CMakeLists.txt b/src/musikcube/CMakeLists.txt
index e16ec8b4..8e7d0f2c 100644
--- a/src/musikcube/CMakeLists.txt
+++ b/src/musikcube/CMakeLists.txt
@@ -96,10 +96,12 @@ if ((${DISABLE_WIDE_NCURSES_LIB_SUFFIXES} MATCHES "true") OR ((APPLE) AND (${ENA
     message(STATUS "[ncurses] using library names *WITHOUT* 'w' prefix")
     set(CURSES_LIBRARY_NAME ncurses)
     set(PANEL_LIBRARY_NAME panel)
+    set(TINFO_LIBRARY_NAME tinfo)
 else()
     message(STATUS "[ncurses] using library names with 'w' prefix")
     set(CURSES_LIBRARY_NAME ncursesw)
     set(PANEL_LIBRARY_NAME panelw)
+    set(TINFO_LIBRARY_NAME tinfow)
 endif()

 if (APPLE)
@@ -114,7 +116,7 @@ else()
         set(LIBTINFO "")
     else()
         message(STATUS "[ncurses] not Darwin! will attempt to link against libtinfo")
-        find_library(LIBTINFO NAMES tinfo)
+        find_library(LIBTINFO NAMES ${TINFO_LIBRARY_NAME})
         message(STATUS "[musikcube] using libtinfo at: " ${LIBTINFO})
     endif()
 endif()

Result:

/usr/share/musikcube $ ldd ./musikcube
    linux-vdso.so.1 (0x00007fffc1be7000)
    libcurl.so.4 => /usr/lib64/libcurl.so.4 (0x00007f68cc4bb000)
    libcrypto.so.3 => /usr/lib64/libcrypto.so.3 (0x00007f68cbc00000)
    libncursesw.so.6 => /usr/lib64/libncursesw.so.6 (0x00007f68cc1ba000)
    libpanelw.so.6 => /usr/lib64/libpanelw.so.6 (0x00007f68cc4b5000)
    libtinfow.so.6 => /usr/lib64/libtinfow.so.6 (0x00007f68cc170000)
    libmusikcore.so => /usr/share/musikcube/./libmusikcore.so (0x00007f68cb600000)
    libstdc++.so.6 => /usr/lib/gcc/x86_64-pc-linux-gnu/13/libstdc++.so.6 (0x00007f68cb200000)
    libm.so.6 => /usr/lib64/libm.so.6 (0x00007f68cbb23000)
    libgcc_s.so.1 => /usr/lib/gcc/x86_64-pc-linux-gnu/13/libgcc_s.so.1 (0x00007f68cc14b000)
    libc.so.6 => /usr/lib64/libc.so.6 (0x00007f68cb027000)
    libcares.so.2 => /usr/lib64/libcares.so.2 (0x00007f68cc499000)
    libnghttp2.so.14 => /usr/lib64/libnghttp2.so.14 (0x00007f68cc11a000)
    libssl.so.3 => /usr/lib64/libssl.so.3 (0x00007f68cb548000)
    libz.so.1 => /usr/lib64/libz.so.1 (0x00007f68cc0f9000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f68cc597000)

No more double-linking and musikcube starts up fine without segfaults.

Next I checked whether this patch would break other systems, say, Debian. Surprisingly, Debian's libncursesw links against libtinfo, and libtinfow does not exist there.

Then I checked how exactly Gentoo's and Debian's ncurses configure options differ.

Gentoo builds with --with-termlib (use_with tinfo termlib means if USE flag tinfo is enabled, expand to --with-termlib): https://github.com/gentoo/gentoo/blob/f2e4bbc44ec1f90456cf73a2c96daffe607a2a4f/sys-libs/ncurses/ncurses-6.4_p20230527.ebuild#L253

Whereas Debian builds with --with-termlib=tinfo: https://packages.debian.org/buster/libncursesw6 -> ncurses_6.1+20181013-2+deb10u3.debian.tar.xz -> /debian/rules -> CONFARGS = [...] --with-termlib=tinfo

And then I read ncurses source: https://github.com/mirror/ncurses/blob/87c2c84cbd2332d6d94b12a1dcaf12ad1a51a938/INSTALL#L1248-L1253

    --with-termlib[=XXX]
[...]
    If an option value is given, that overrides the name of the terminfo
    library.  For instance, if the wide-character version is built, the
    terminfo library would be named libtinfow.  But the libtinfow interface
    is upward compatible from libtinfo, so it would be possible to overlay
    libtinfo.so with a "wide" version of libtinfow.so by renaming it with
    this option.

So if I understand this correctly, Debian went with the "overlay" approach, where libtinfo is always the wide version, whereas Gentoo keeps both versions separate.

I think it might be a good idea to either perform a detection via ncursesw6-config --libs (which would tell you the link flags needed), or attempt to look for libtinfow first, and if that fails, look for libtinfo.
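
The second suggestion (look for libtinfow first, then fall back to libtinfo) could be sketched in shell as follows. This is only an illustration under assumptions: the helper name pick_tinfo_name is hypothetical, the directories to search are passed in by the caller, and it looks for the unversioned .so names a linker would use. In CMake, listing both names in find_library (NAMES tinfow tinfo) should give a similar preference order, since names are tried one at a time by default.

```shell
#!/bin/sh
# pick_tinfo_name (hypothetical helper): print the terminfo library name to
# link against, preferring the wide variant. Arguments: directories to search.
pick_tinfo_name() {
    # Outer loop over names, so tinfow wins if it exists in any directory.
    for name in tinfow tinfo; do
        for dir in "$@"; do
            if [ -e "$dir/lib$name.so" ]; then
                printf '%s\n' "$name"
                return 0
            fi
        done
    done
    return 1  # neither library was found
}
```

For example, pick_tinfo_name /usr/lib64 /lib64 would print tinfow where a separate libtinfow.so is installed in one of those directories, and tinfo where only libtinfo.so is present.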

Oh btw, @Gerodote would you mind if I send the ebuild to GURU?

Arniiiii commented 1 year ago

@zhuyifei1999 of course, do that.

zhuyifei1999 commented 1 year ago

Done https://github.com/gentoo/guru/commit/a3a749f80dbc78199c2526e5692374b3fd882e41 :wink:

clangen commented 1 year ago

Sorry, my ability to work on musikcube sort of ebbs and flows with real life responsibilities... I've been away for a few weeks, but it looks like this issue has been resolved and the ebuild is working fine? Closing for now, but feel free to re-open.

zhuyifei1999 commented 1 year ago

I'll reopen because we are carrying this patch downstream in order to work around this: https://github.com/gentoo/guru/blob/fb3679d5071a1b10333c964b121eab0223cfb803/media-sound/musikcube/files/musikcube-3.0.1-tinfow.patch

Ideally we wouldn't need this patch, and someone could build directly from source on a Gentoo system by cloning this repo.

zhuyifei1999 commented 1 year ago

(uh, I don't have a reopen button)

clangen commented 1 year ago

Ah, makes sense. Re-opening.