rakshasa / libtorrent

libTorrent BitTorrent library
http://rtorrent.net/downloads/
GNU General Public License v2.0

libtorrent.so 0.13.8 crash on Raspberry Pi 4 kernel due to unaligned access #244

Open MetalKnight opened 1 year ago

MetalKnight commented 1 year ago

libtorrent 0.13.8, distributed with rtorrent 0.9.8. Raspberry Pi 4 kernel version: 6.1.21-v8+ aarch64 GNU/Linux

rtorrent runs on a Raspberry Pi 4 with storage on an ext4 HDD. It was running fine for ages. I updated the Pi 4 to the latest update yesterday, and now rtorrent crashes after downloading some data from a torrent. Restarting the process makes it crash the same way after the file checksum has completed.

Caught SIGBUS, dumping stack:
rtorrent() [0x1fdf0]
/lib/arm-linux-gnueabihf/libc.so.6(__default_rt_sa_restorer+0) [0xf7493910]
/usr/lib/arm-linux-gnueabihf/libtorrent.so.21(+0xd3af4) [0xf792aaf4]
/usr/lib/arm-linux-gnueabihf/libtorrent.so.21(_ZN7torrent9PollEPoll7performEv+0xe0) [0xf7894984]
/usr/lib/arm-linux-gnueabihf/libtorrent.so.21(_ZN7torrent9PollEPoll7do_pollExi+0x78) [0xf7894adc]
/usr/lib/arm-linux-gnueabihf/libtorrent.so.21(_ZN7torrent11thread_base10event_loopEPS0_+0x174) [0xf78d189c]
rtorrent() [0x1e660]
/lib/arm-linux-gnueabihf/libc.so.6(__libc_start_main+0x114) [0xf747b740]

Error: Success
Signal code '1': Invalid address alignment.
Fault address: 0x110946d
The fault address is not part of any chunk.
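
Note on the dump above: the fault address 0x110946d ends in 0xd, so it is odd and cannot be naturally aligned for any access wider than one byte, which is consistent with the "Invalid address alignment" signal code. A minimal sketch of that check follows; `is_aligned` is a hypothetical helper written for illustration and is not part of libtorrent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true when `address` is a multiple of `alignment`.
// `alignment` must be a non-zero power of two (checked by the assert).
bool is_aligned(std::uintptr_t address, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (address & (alignment - 1)) == 0;
}
```

Calling `is_aligned(0x110946d, 4)` returns false, while the next lower word boundary, `0x110946c`, passes the same check.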

Running it under gdb gives the following, which shows that the fault is indeed in Thread 1, inside libtorrent.so:

(gdb) thread apply all backtrace

Thread 3 (Thread 0xf5dff200 (LWP 1536) "rtorrent scgi"):
#0  0xf7a7a1dc in epoll_wait (epfd=13, events=0x1f6750, maxevents=1024, timeout=600001) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0xf7dce898 in torrent::PollEPoll::poll(int) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#2  0xf7dceac8 in torrent::PollEPoll::do_poll(long long, int) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#3  0xf7e0b89c in torrent::thread_base::event_loop(torrent::thread_base*) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#4  0xf7af8310 in start_thread (arg=0xf5dff200) at pthread_create.c:477
#5  0xf7a79da8 in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:73 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 2 (Thread 0xf6770200 (LWP 1535) "rtorrent disk"):
#0  0xf7a7a1dc in epoll_wait (epfd=10, events=0x1f01b8, maxevents=1024, timeout=10001) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0xf7dce898 in torrent::PollEPoll::poll(int) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#2  0xf7dceac8 in torrent::PollEPoll::do_poll(long long, int) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#3  0xf7e0b89c in torrent::thread_base::event_loop(torrent::thread_base*) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#4  0xf7af8310 in start_thread (arg=0xf6770200) at pthread_create.c:477
#5  0xf7a79da8 in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:73 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 1 (Thread 0xf6c2b040 (LWP 1531) "rtorrent main"):
#0  0xf7e64af4 in ?? () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#1  0xf7dce984 in torrent::PollEPoll::perform() () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#2  0xf7dceadc in torrent::PollEPoll::do_poll(long long, int) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#3  0xf7e0b89c in torrent::thread_base::event_loop(torrent::thread_base*) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#4  0x0001e660 in ?? ()
#5  0xf79b5740 in __libc_start_main (main=0xfffef6e4, argc=-139534336, argv=0xf79b5740 <__libc_start_main+276>, init=<optimized out>, fini=0x170f98, rtld_fini=0xf7fcd510 <_dl_fini>, stack_end=0xfffef6e4) at libc-start.c:308
#6  0x0001f2c0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
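
The mangled symbols in the SIGBUS stack dump (for example `_ZN7torrent9PollEPoll7performEv`) are the same frames gdb prints in demangled form for Thread 1. A small sketch of how to demangle them by hand, assuming a GCC or Clang toolchain that provides `abi::__cxa_demangle` in `<cxxabi.h>`; the `demangle` wrapper is a hypothetical name used only here.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Hypothetical wrapper: demangles an Itanium C++ ABI symbol name.
// Returns the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
  int status = 0;
  // With a null output buffer, __cxa_demangle allocates the result with malloc.
  char* result = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || result == nullptr) return mangled;
  std::string out(result);
  std::free(result);
  return out;
}
```

For instance, `demangle("_ZN7torrent9PollEPoll7performEv")` yields `torrent::PollEPoll::perform()`, matching frame #1 of Thread 1.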

MetalKnight commented 1 year ago

Awaiting instructions on how to provide further debug info. Thanks.

MetalKnight commented 1 year ago

kernel alignment options

sudo modprobe configs
zgrep ALIGN /proc/config.gz
# CONFIG_COMPAT_ALIGNMENT_FIXUPS is not set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
CONFIG_CMA_ALIGNMENT=8
# CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B is not set

MetalKnight commented 1 year ago

Some debugging thoughts taken from https://github.com/epics-base/pvDataCPP/issues/84

MetalKnight commented 1 year ago

I've found a workaround:

I can revert to the old binary at any time if you still need me to help debug the unaligned-access issue.

stickz commented 6 months ago

Thanks so much for reporting this. I would like to add that this also happens on x86, not just ARM. I can confirm that configuring rakshasa/libtorrent with --enable-aligned resolves the problem.

../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) bt
#0  __memmove_avx_unaligned () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:222
#1  0x00007ffff7dfd269 in torrent::Chunk::to_buffer(void*, unsigned int, unsigned int) () from /lib/libtorrent.so.21
#2  0x00007ffff7e2ad0c in torrent::PeerConnectionBase::up_chunk() () from /lib/libtorrent.so.21
#3  0x00007ffff7e2eb3a in torrent::PeerConnection<(torrent::Download::ConnectionType)1>::event_write() ()
   from /lib/libtorrent.so.21
#4  0x00007ffff7dd0848 in torrent::PollEPoll::perform() () from /lib/libtorrent.so.21
#5  0x00007ffff7df3b62 in torrent::thread_base::event_loop(torrent::thread_base*) () from /lib/libtorrent.so.21
#6  0x000055555558e8c4 in main ()
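
For readers unfamiliar with this class of bug: the usual alignment-safe way to read a multi-byte value from an arbitrary address in C++ is to copy it out with `memcpy` rather than dereference a cast pointer. The sketch below illustrates that general technique only; it is not taken from libtorrent, it makes no claim about what `--enable-aligned` changes internally, and `load_u32` is a hypothetical name.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reads a 32-bit value from a possibly unaligned address.
// memcpy lets the compiler emit instructions that are valid for any alignment,
// whereas *reinterpret_cast<const std::uint32_t*>(p) is undefined behaviour
// when p is not suitably aligned and can raise SIGBUS on some targets.
std::uint32_t load_u32(const void* p) {
  std::uint32_t value;
  std::memcpy(&value, p, sizeof value);
  return value;
}
```

Compilers typically reduce the `memcpy` to a single load on targets where unaligned access is permitted, so the safe form usually costs nothing there.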