arvidn / libtorrent

an efficient feature complete C++ bittorrent implementation
http://libtorrent.org

Segmentation fault when built with cxx14 #6567

Closed graywolf closed 2 years ago

graywolf commented 2 years ago

Please provide the following information

libtorrent version (or branch): 1.2.14

platform/architecture: linux/amd64 + musl-c (1.2.2) (alpine linux 3.15)

compiler and compiler version: gcc 10.3.1

please describe what symptom you see, what you would expect to see instead and how to reproduce it.

When libtorrent-rasterbar is built with cxx14, it segfaults when adding new torrents (for me it was on deluge startup, but adding a torrent is easier to reproduce). This started happening once I updated to alpine 3.15, so the new musl-c version could play some role in this (speculation on my part). With cxx17 it works fine. I've managed to put together a reasonably short reproduction [0].

0: https://git.sr.ht/~graywolf/libtorrent-rasterbar-segfault-repro

arvidn commented 2 years ago

I'm not familiar with the tools you use to build libtorrent. I suspect this is an ABI incompatibility issue somewhere. But could you provide a way to reproduce this using one of the supported build systems, i.e. boost-build or cmake?

Using other build systems runs the risk of causing ABI incompatibility issues. Are you building libtorrent and your test program from source with the same configuration?
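
One quick way to check the configuration question is to have each component report the language mode it was compiled in. The sketch below is only an illustration: `cxx_standard()` is a hypothetical helper, not part of libtorrent, and it relies solely on the standard `__cplusplus` macro.

```cpp
#include <cassert>

// Hypothetical helper (not a libtorrent API): maps the standard
// __cplusplus macro to a short standard number (11, 14, 17, 20, 23),
// or 0 for anything older than C++11.
inline int cxx_standard() {
#if __cplusplus >= 202302L
    return 23;
#elif __cplusplus >= 202002L
    return 20;
#elif __cplusplus >= 201703L
    return 17;
#elif __cplusplus >= 201402L
    return 14;
#elif __cplusplus >= 201103L
    return 11;
#else
    return 0;
#endif
}
```

Compiling this once with the flags used for the library and once with the flags used for the client or binding shows whether the two builds agree. Matching values do not rule out other ABI differences.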

graywolf commented 2 years ago

The build script for creating the packages uses boost-build internally. However, I understand the question and will try to produce something more minimal over the weekend.

userdocs commented 2 years ago

I am not sure this is a libtorrent issue, and the report is a bit vague. What I do have is a docker build script that can quickly toggle certain features and options by setting them in the command, which should provide the testing setup needed for this issue.

# boost_v= set the boost version using just 74/75/76/77
# build_d= set the build directory - default is lt-build relative to the container /root
# install_d= set the completed directory based on the build dir name
# libtorrent_b= set the libtorrent branch to use - default is RC_2_0
# cxxstd= set the cxx standard 11/14/17 - default is 17
# libtorrent= build libtorrent yes/no - default is yes
# python_b= build the python binding yes/no - default is yes
# python_v= set the python version 2/3 - default is 3
# crypto= set wolfssl as alternative to openssl (default)
# lto= null is off and the default - set to on to use lto
# system_crypto= use system libs [yes] or git latest release [no]

Build command: boost 77, libtorrent RC_1_2, cxx 14, python binding only

The binding is installed relative to the docker root, which is why it will work with the test command afterwards.

docker run -it -w /root -v ~/build:/root alpine:latest /bin/ash -c \
'apk add bash curl ncurses \
&& curl -sL git.io/JXDOJ | bash -s \
boost_v=77 build_d= libtorrent_b=RC_1_2 cxxstd=14 libtorrent=no python_b=yes python_v=3 lto= crypto= system_crypto=yes'

Test command:

docker run -it -w /root -p 8112:8112 -v ~/build:/root alpine:latest /bin/ash -c \
'apk add deluge \
&& deluged \
&& deluge-web \
&& ash'

From my testing, with a cxx standard 14 python binding and deluge 2.0.3 on python 3, deluged crashes after adding a new torrent. It can be fixed by building the binding with cxx standard 17.

But I know that qbittorrent raised its minimum standard to C++17 with release 4.3.3, so a C++14 build would not work against newer versions anyway.

Tuesday January 19th 2021 - qBittorrent v4.3.3 release
v4.3.3 changelog:
...
OTHER: Bump project requirement to C++17 (Chocobo1)

I'm not 100% sure what the OP is doing or how, since qbittorrent-nox is in Alpine edge testing, GCC 10.3 is in latest stable main, and edge main is using gcc 11.2:

https://pkgs.alpinelinux.org/packages?name=gcc&branch=v3.15 = 10.3.1
https://pkgs.alpinelinux.org/packages?name=gcc&branch=edge = 11.2.1
https://pkgs.alpinelinux.org/packages?name=qbittorrent-nox&branch=edge = 4.3.9

arvidn commented 2 years ago

so, the main library is built with a different c++ version than the python binding, is that right?

graywolf commented 2 years ago

I'm not sure how qbittorrent and it being in edge/testing is even relevant. I don't use it, so I'm not sure what is the connection here.

Anyway, I've managed to get a reliable reproduction using just libtorrent-rasterbar compiled from source and about 20 lines of c++. So no python bindings or third party python program using them. Once it finishes running (so I'm actually sure it is correct; this is my first time using libtorrent-rasterbar directly, so it took a bit of time), I will post it in another comment.

userdocs commented 2 years ago

It's relevant because you have not been specific about what you are doing or which clients are being used. You originally said

well, for me it was on deluge startup

Which means you need to have the python binding. If you are not using the binding qbittorrent is the only other client available. If you had provided more info I wouldn't have to guess.

graywolf commented 2 years ago

Ah, sorry about that then. The reproduction repository contained https://git.sr.ht/~graywolf/libtorrent-rasterbar-segfault-repro/tree/master/item/repro , which I thought was reasonably clear. Anyway, I will post another one using just libtorrent-rasterbar (no python) shortly.

userdocs commented 2 years ago

@arvidn i used the simple client and got this (using cxx 14 for the binding)

~/lt-build/libtorrent/bindings/python # ./simple_client.py
Traceback (most recent call last):
  File "/root/lt-build/libtorrent/bindings/python/./simple_client.py", line 13, in <module>
    info = lt.torrent_info(sys.argv[1])
IndexError: list index out of range
Segmentation fault (core dumped)

Are there any other tests I can do like this to pinpoint it?

arvidn commented 2 years ago

you could pin-point it by loading up that core file in gdb.

I'm fairly confident this is an ABI issue rooted in the way the main library and the python bindings are built. So understanding those two things I think is the most important part.

arvidn commented 2 years ago

when building libtorrent and the python bindings from source using boost-build (which will ensure they are ABI compatible) it works fine:

cd bindings/python
b2 stage_dependencies boost-link=shared stage_module
LD_LIBRARY_PATH=./dependencies/ python simple_client.py <torrent-file>

arvidn commented 2 years ago

I'm fairly confident the problem is this:

  1. libtorrent (the main library) is built in C++11 mode and "installed"
  2. the python bindings are built in C++14 mode linking against the installed main library
  3. in C++11 mode, the entry type will use a work-around type to allow lookups by string_view in its dictionary, whereas in C++14 and later, it will actually use the transparent comparator support that was added to std::map in C++14. This causes an ABI incompatibility.

If you want to build the python bindings in C++14 or later mode, and link against the main library built in C++11 mode, you have to define TORRENT_CXX11_ABI. It was mentioned in the changelog when it was introduced.
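
To illustrate point 3, here is a minimal sketch of why a comparator change alters the type of a dictionary. It is not libtorrent's actual entry code: `key_view`, `transparent_less`, `plain_dict`, `transparent_dict` and `has_key` are hypothetical names, and `key_view` stands in for string_view so that the example compiles as C++14.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <type_traits>

// Hypothetical stand-in for string_view (which only arrived in C++17).
struct key_view {
    char const* data;
    std::size_t size;
};

// Transparent comparator: the is_transparent tag enables heterogeneous
// lookup in std::map (available since C++14).
struct transparent_less {
    using is_transparent = void;
    bool operator()(std::string const& a, std::string const& b) const {
        return a < b;
    }
    bool operator()(std::string const& a, key_view b) const {
        return a.compare(0, std::string::npos, b.data, b.size) < 0;
    }
    bool operator()(key_view a, std::string const& b) const {
        return b.compare(0, std::string::npos, a.data, a.size) > 0;
    }
};

// Same key and value types, different comparator.
using plain_dict = std::map<std::string, int>;
using transparent_dict = std::map<std::string, int, transparent_less>;

// The comparator is part of the map's type, so these are distinct types.
static_assert(!std::is_same<plain_dict, transparent_dict>::value,
    "changing the comparator changes the map type");

// Looks up a key without constructing a temporary std::string.
inline bool has_key(transparent_dict const& d, key_view k) {
    return d.find(k) != d.end();
}
```

Because the comparator is a template argument of std::map, the two dictionary types are unrelated as far as the compiler is concerned, so code compiled against one cannot safely receive an object of the other.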

graywolf commented 2 years ago

So here is the new, more minimal reproduction: https://git.sr.ht/~graywolf/libtorrent-rasterbar-repro-6567

The repository contains a complete example of how to reproduce it, including building libtorrent-rasterbar from the release tarball inside a docker image. Hopefully it is reproducible. The architecture where it crashed for me is amd64.

Using this example:

#include <chrono>
#include <iostream>
#include <thread>
#include <vector>

#include <libtorrent/alert_types.hpp>
#include <libtorrent/magnet_uri.hpp>
#include <libtorrent/session.hpp>

#define ADD_MAGNET(uri) do { \
    total++; \
    lt::add_torrent_params atp = lt::parse_magnet_uri(uri); \
    atp.save_path = "."; \
    ses.add_torrent(std::move(atp)); \
} while(0)

int main() try {
    lt::settings_pack p;
    p.set_int(lt::settings_pack::alert_mask,
        lt::alert_category::status | lt::alert_category::error);

    lt::session ses(p);

    int done = 0, total = 0;

    /* Archlinux iso image, legal to download. */
    ADD_MAGNET("magnet:?xt=urn:btih:49bd6ee35e815507c0404b49a2de1242378cb0c6");

    for (;;) {
        std::vector<lt::alert*> alerts;
        ses.pop_alerts(&alerts);

        for (lt::alert const* a : alerts) {
            std::cout << a->message() << std::endl;
            if (lt::alert_cast<lt::torrent_finished_alert>(a)) {
                done++;
            }
        }

        if (done == total) { break; }

        std::this_thread::sleep_for(std::chrono::milliseconds(200));
    }
    std::cout << "done, shutting down" << std::endl;
}
catch (std::exception const& e) {
    std::cerr << "Error: " << e.what() << std::endl;
}

c++14:

[..]
added torrent: 49bd6ee35e815507c0404b49a2de1242378cb0c6
49bd6ee35e815507c0404b49a2de1242378cb0c6 added
49bd6ee35e815507c0404b49a2de1242378cb0c6: state changed to: dl metadata
external IP received: XXX.XXX.XXX.XXX
49bd6ee35e815507c0404b49a2de1242378cb0c6 resumed
archlinux-2021.11.01-x86_64.iso metadata successfully received
archlinux-2021.11.01-x86_64.iso: state changed to: checking (r)
archlinux-2021.11.01-x86_64.iso: state changed to: downloading
archlinux-2021.11.01-x86_64.iso checked
Segmentation fault (core dumped)
make: *** [Makefile:6: bad] Error 139

c++17:

[..]
added torrent: 49bd6ee35e815507c0404b49a2de1242378cb0c6
49bd6ee35e815507c0404b49a2de1242378cb0c6 added
49bd6ee35e815507c0404b49a2de1242378cb0c6: state changed to: dl metadata
external IP received: XXX.XXX.XXX.XXX
49bd6ee35e815507c0404b49a2de1242378cb0c6 resumed
archlinux-2021.11.01-x86_64.iso metadata successfully received
archlinux-2021.11.01-x86_64.iso: state changed to: checking (r)
archlinux-2021.11.01-x86_64.iso: state changed to: downloading
archlinux-2021.11.01-x86_64.iso checked
archlinux-2021.11.01-x86_64.iso hash for piece 1198 failed
could not map port using UPnP: no router found
could not map port using UPnP: no router found
archlinux-2021.11.01-x86_64.iso: state changed to: finished
archlinux-2021.11.01-x86_64.iso torrent finished downloading
archlinux-2021.11.01-x86_64.iso: state changed to: seeding
done, shutting down

userdocs commented 2 years ago

Loading the torrent in the simple client

~ # python3 lt-build/libtorrent/bindings/python/simple_client.py ubuntu-20.iso.torrent
starting ubuntu-20.04.3-live-server-amd64.iso
0.00% complete (down: 0.0 kB/s up: 0.0 kB/s peers: 4) downloading ubuntu-20.04.3-live-server-amd64.iso (https://torrent.ubuntu.com/announce)[127.0.0.1:6881] skipping tracker announce (unreachable) "" (1)
ubuntu-20.04.3-live-server-amd64.iso (https://ipv6.torrent.ubuntu.com/announce)[127.0.0.1:6881] Host not found (authoritative) "" (1)
Segmentation fault (core dumped)

GDB with the core dump

~ # gdb python3 core
GNU gdb (GDB) 11.1
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-alpine-linux-musl".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python3...
Reading symbols from /usr/lib/debug//usr/bin/python3.9.debug...

warning: core file may not match specified executable file.
[New LWP 12]
[New LWP 11]
[New LWP 13]
[New LWP 14]
Core was generated by `python3 lt-build/libtorrent/bindings/python/simple_client.py ubuntu-20.iso.torr'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fe7cb1d6859 in ?? () from /lib/ld-musl-x86_64.so.1
[Current thread is 1 (LWP 12)]

Alpine is not super helpful here.

userdocs commented 2 years ago

This is the best i could get, does it help at all?

(gdb) run lt-build/libtorrent/bindings/python/simple_client.py ubuntu-20.iso.torrent
Starting program: /usr/bin/python3 lt-build/libtorrent/bindings/python/simple_client.py ubuntu-20.iso.torrent
warning: Error disabling address space randomization: Operation not permitted
[New LWP 55]
[New LWP 56]
starting ubuntu-20.04.3-desktop-amd64.iso
[New LWP 57]
0.00% complete (down: 0.0 kB/s up: 0.0 kB/s peers: 0) checking_resume_data
Thread 2 "python3" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 55]
0x00007ff65ccb2859 in ?? () from /lib/ld-musl-x86_64.so.1
(gdb) backtrace
#0  0x00007ff65ccb2859 in ?? () from /lib/ld-musl-x86_64.so.1
#1  0x00007ff65ccb2c2e in ?? () from /lib/ld-musl-x86_64.so.1
#2  0x0000000000000000 in ?? ()
(gdb)

arvidn commented 2 years ago

could either of you confirm or reject my hypothesis? https://github.com/arvidn/libtorrent/issues/6567#issuecomment-980668556

userdocs commented 2 years ago

I am building it myself in docker, so there is no cxx11 version in my env.

arvidn commented 2 years ago

so, the main libtorrent library is also built by you with -std=c++14 then?

graywolf commented 2 years ago

Since my second reproduction is completely without the python binding, I don't think that is the cause.

userdocs commented 2 years ago

in the example I gave above I'm not building the main library, just the python binding, but if I enable the main library then they use whatever standard I set via b2 cxxstd="14" for both builds.

I think this is musl related perhaps but I cannot see the build logs to know how they built it.

userdocs commented 2 years ago

how about this?

(gdb) run lt-build/libtorrent/bindings/python/simple_client.py ubuntu-20.iso.torrent
Starting program: /usr/bin/python3 lt-build/libtorrent/bindings/python/simple_client.py ubuntu-20.iso.torrent
warning: Error disabling address space randomization: Operation not permitted
[New LWP 24]
[New LWP 25]
[New LWP 26]
starting ubuntu-20.04.3-desktop-amd64.iso
0.00% complete (down: 0.0 kB/s up: 0.0 kB/s peers: 15) downloading ubuntu-20.04.3-desktop-amd64.iso (https://torrent.ubuntu.com/announce)[127.0.0.1:6881] skipping tracker announce (unreachable) "" (1)
ubuntu-20.04.3-desktop-amd64.iso (https://ipv6.torrent.ubuntu.com/announce)[127.0.0.1:6881] Host not found (authoritative) "" (1)
0.00% complete (down: 1.0 kB/s up: 1.6 kB/s peers: 15) downloading
Thread 2 "python3" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 24]
get_meta (p=p@entry=0x6eb60000000000 <error: Cannot access memory at address 0x6eb60000000000>) at src/malloc/mallocng/meta.h:132
132     src/malloc/mallocng/meta.h: No such file or directory.
(gdb) backtrace
#0  get_meta (p=p@entry=0x6eb60000000000 <error: Cannot access memory at address 0x6eb60000000000>) at src/malloc/mallocng/meta.h:132
#1  0x00007fcd285a5c2e in __libc_free (p=0x6eb60000000000) at src/malloc/mallocng/free.c:105
#2  0x00007fcd285a53c2 in free (p=<optimized out>) at src/malloc/free.c:5
#3  0x00007fcd27a36081 in boost::alignment::aligned_free (ptr=0x7fcd273aee10) at /root/lt-build/boost_1_77_0/boost/align/detail/aligned_alloc.hpp:45
#4  boost::asio::aligned_delete (ptr=0x7fcd273aee10) at /root/lt-build/boost_1_77_0/boost/asio/detail/memory.hpp:124
#5  boost::asio::detail::thread_info_base::deallocate<boost::asio::detail::thread_info_base::default_tag> (size=<optimized out>, pointer=<optimized out>, this_thread=<optimized out>) at /root/lt-build/boost_1_77_0/boost/asio/detail/thread_info_base.hpp:198
#6  boost::asio::detail::thread_info_base::deallocate (size=<optimized out>, pointer=<optimized out>, this_thread=<optimized out>) at /root/lt-build/boost_1_77_0/boost/asio/detail/thread_info_base.hpp:129
#7  boost::asio::asio_handler_deallocate (size=40, pointer=0x7fcd273aee10) at /root/lt-build/boost_1_77_0/boost/asio/impl/handler_alloc_hook.ipp:51
#8  boost_asio_handler_alloc_helpers::deallocate<libtorrent::aux::session_impl::deferred_submit_jobs()::<lambda()> > (h=..., s=40, p=0x7fcd273aee10) at /root/lt-build/boost_1_77_0/boost/asio/detail/handler_alloc_helpers.hpp:95
#9  boost::asio::detail::hook_allocator<libtorrent::aux::session_impl::deferred_submit_jobs()::<lambda()>, boost::asio::detail::completion_handler<libtorrent::aux::session_impl::deferred_submit_jobs()::<lambda()>, boost::asio::io_context::basic_executor_type<std::allocator<void>, 0> > >::deallocate (n=1,
    p=0x7fcd273aee10, this=<synthetic pointer>) at /root/lt-build/boost_1_77_0/boost/asio/detail/handler_alloc_helpers.hpp:137
#10 boost::asio::detail::completion_handler<libtorrent::aux::session_impl::deferred_submit_jobs()::<lambda()>, boost::asio::io_context::basic_executor_type<std::allocator<void>, 0> >::ptr::reset (this=0x7fcd2730e8a0) at /root/lt-build/boost_1_77_0/boost/asio/detail/completion_handler.hpp:35
#11 boost::asio::detail::completion_handler<libtorrent::aux::session_impl::deferred_submit_jobs()::<lambda()>, boost::asio::io_context::basic_executor_type<std::allocator<void>, 0> >::do_complete(void *, boost::asio::detail::operation *, const boost::system::error_code &, std::size_t) (owner=0x7fcd274aed70,
    base=0x7fcd273aee10) at /root/lt-build/boost_1_77_0/boost/asio/detail/completion_handler.hpp:67
#12 0x00007fcd2799ea49 in boost::asio::detail::scheduler_operation::complete (bytes_transferred=0, ec=..., owner=0x7fcd274aed70, this=0x7fcd273aee10) at /root/lt-build/boost_1_77_0/boost/asio/detail/scheduler_operation.hpp:40
#13 boost::asio::detail::scheduler::do_run_one (ec=..., this_thread=..., lock=..., this=0x7fcd274aed70) at /root/lt-build/boost_1_77_0/boost/asio/detail/impl/scheduler.ipp:486
#14 boost::asio::detail::scheduler::run (this=0x7fcd274aed70, ec=...) at /root/lt-build/boost_1_77_0/boost/asio/detail/impl/scheduler.ipp:204
#15 0x00007fcd279f861e in boost::asio::io_context::run (this=<optimized out>) at /root/lt-build/boost_1_77_0/boost/asio/impl/io_context.ipp:63
#16 operator() (__closure=<optimized out>) at ../../src/session.cpp:363
#17 std::__invoke_impl<void, libtorrent::session::start(libtorrent::session_handle::session_flags_t, libtorrent::session_params&&, libtorrent::io_service*)::<lambda()> > (__f=...) at /usr/include/c++/10.3.1/bits/invoke.h:60
#18 std::__invoke<libtorrent::session::start(libtorrent::session_handle::session_flags_t, libtorrent::session_params&&, libtorrent::io_service*)::<lambda()> > (__fn=...) at /usr/include/c++/10.3.1/bits/invoke.h:95
#19 std::thread::_Invoker<std::tuple<libtorrent::session::start(libtorrent::session_handle::session_flags_t, libtorrent::session_params&&, libtorrent::io_service*)::<lambda()> > >::_M_invoke<0> (this=<optimized out>) at /usr/include/c++/10.3.1/thread:264
#20 std::thread::_Invoker<std::tuple<libtorrent::session::start(libtorrent::session_handle::session_flags_t, libtorrent::session_params&&, libtorrent::io_service*)::<lambda()> > >::operator() (this=<optimized out>) at /usr/include/c++/10.3.1/thread:271
#21 std::thread::_State_impl<std::thread::_Invoker<std::tuple<libtorrent::session::start(libtorrent::session_handle::session_flags_t, libtorrent::session_params&&, libtorrent::io_service*)::<lambda()> > > >::_M_run(void) (this=<optimized out>) at /usr/include/c++/10.3.1/thread:215
#22 0x00007fcd276576e9 in ?? () from /usr/lib/libstdc++.so.6
#23 0x00007fcd285d5221 in start (p=0x7fcd2730f0d8) at src/thread/pthread_create.c:203
#24 0x00007fcd285d73e0 in __clone () at src/thread/x86_64/clone.s:22
Backtrace stopped: frame did not save the PC

arvidn commented 2 years ago

yeah, that does look related to musl rather than ABI. Is there an easy way for me to use musl on ubuntu?
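
For readers following the backtrace: frames #0 to #3 show aligned_free being called with a plausible heap pointer (0x7fcd273aee10) while free receives 0x6eb60000000000. One possible explanation, stated here as an assumption and not confirmed anywhere in this thread, is a portable aligned allocator that stores the raw malloc pointer just in front of the aligned block and reads it back on free. The sketch below shows that general technique with hypothetical `demo_*` helpers; it is not Boost's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: over-allocate with malloc, round up to the requested
// alignment, and stash the raw malloc pointer in the slot just before the
// aligned block. Assumes alignment is a power of two >= sizeof(void*).
inline void* demo_aligned_alloc(std::size_t alignment, std::size_t size) {
    assert(alignment >= sizeof(void*) && (alignment & (alignment - 1)) == 0);
    void* raw = std::malloc(size + alignment + sizeof(void*));
    if (!raw) return nullptr;
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t aligned =
        (start + alignment - 1) & ~(static_cast<std::uintptr_t>(alignment) - 1);
    *(reinterpret_cast<void**>(aligned) - 1) = raw;
    return reinterpret_cast<void*>(aligned);
}

// Reads back the raw pointer stored in front of an aligned block.
inline void* demo_stashed_pointer(void* p) {
    return *(static_cast<void**>(p) - 1);
}

// Frees a block by handing the stashed raw pointer to free(). This is only
// valid for blocks that came from demo_aligned_alloc.
inline void demo_aligned_free(void* p) {
    if (p) std::free(demo_stashed_pointer(p));
}
```

If a block that was not produced by the matching allocation function is passed to such a free function, the slot in front of it holds unrelated bytes, and free is handed a garbage address much like the one in frame #1.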

graywolf commented 2 years ago

Easiest would likely be docker, alpine:3.15, that is what I'm using in my reproduction.

userdocs commented 2 years ago

I was using this (i had not used musl-dbg before and this is important here)

docker run -it -w /root -v ~/build_latest:/root alpine:latest /bin/ash -c \
'apk add binutils python3 python3-dbg python3-dev gdb libexecinfo musl-dbg \
&& ash'

userdocs commented 2 years ago

This is what i am doing so i think it will work for you.

You just need the torrent file ubuntu-20.iso.torrent

docker run -it -w /root -v ~/build_latest:/root alpine:latest /bin/ash -c \
'apk add bash curl ncurses \
&& curl -sL git.io/JXDOJ | bash -s \
boost_v=77 build_d= libtorrent_b=RC_1_2 cxxstd=14 libtorrent=no python_b=yes python_v=3 lto= crypto= system_crypto=yes debug_symbols=on'

docker run -it -w /root -v ~/build_latest:/root alpine:latest /bin/ash -c \
'apk add binutils python3 python3-dbg python3-dev gdb libexecinfo musl-dbg \
&& ash'

gdb python3
add-symbol-file .local/lib/python3.9/site-packages/libtorrent.so
run lt-build/libtorrent/bindings/python/simple_client.py ubuntu-20.iso.torrent

stdjs commented 2 years ago

yeah, that does look related to musl rather than ABI. Is there an easy way for me to use musl on ubuntu?

You can download a toolchain from http://musl.cc/

userdocs commented 2 years ago

I'm not sure the musl cross toolchains are the right tool for the job here tbh.

Ubuntu can install musl but you need hirsute to get the matching version to Alpine stable

https://packages.ubuntu.com/hirsute/musl

Also if you are using WSL2 someone made an Alpine image https://www.microsoft.com/en-gb/p/alpine-wsl/9p804crf0395

But i think the docker method is the easiest and best way.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

graywolf commented 2 years ago

I've opened #6831 since this is still an issue.