Closed toofar closed 1 year ago
Hm. At a glance, without firing up a debugger, I'm thinking that jemalloc's malloc isn't reentrant, and we're calling back into it at a time that it wasn't prepared for. I'll try to dig into it in a day or so...
We may want to skip libunwind from the list of stuff we patch
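A minimal sketch of what such a skip list could look like, assuming libraries are matched by substring of their path, as the existing check in elf_shenanigans.cpp does for the linker and the vDSO. The names should_skip_patching and kSkipSubstrings are hypothetical and not part of memray; the libunwind entry is the addition suggested here.

```cpp
#include <array>
#include <cstring>

// Hypothetical skip list: substrings of shared-object paths whose symbols
// should be left unpatched. The first two mirror the existing check; the
// libunwind entry is the suggestion from this comment.
constexpr std::array<const char*, 3> kSkipSubstrings = {
    "/ld-linux",
    "linux-vdso.so.1",
    "libunwind",
};

// Returns true when `path` (for example dl_phdr_info::dlpi_name) contains
// any entry of the skip list.
bool should_skip_patching(const char* path)
{
    if (path == nullptr) {
        return false;
    }
    for (const char* needle : kSkipSubstrings) {
        if (std::strstr(path, needle) != nullptr) {
            return true;
        }
    }
    return false;
}
```

Keeping the list in one table means that adding another library later is a one-line change rather than another clause in the if condition.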
Yeah, it seems that jemalloc is indeed not reentrant: https://github.com/jemalloc/jemalloc/issues/501
On the other hand, there is a bunch of things that I don't like here:
Thread 3 (Thread 0x7f09d24236c0 (LWP 18456) "python3"):
#0 futex_wait (private=0, expected=2, futex_word=0x7f09e86d5700) at ../sysdeps/nptl/futex-internal.h:146
#1 __GI___lll_lock_wait (futex=futex@entry=0x7f09e86d5700, private=0) at ./nptl/lowlevellock.c:49
#2 0x00007f09e8382262 in lll_mutex_lock_optimized (mutex=0x7f09e86d5700) at ./nptl/pthread_mutex_lock.c:48
#3 ___pthread_mutex_lock (mutex=0x7f09e86d5700) at ./nptl/pthread_mutex_lock.c:93
#4 0x00007f09e86923a0 in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#5 0x00007f09e8621944 in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#6 0x00007f09e8621b8f in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#7 0x00007f09e862259d in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#8 0x00007f09e86addf9 in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#9 0x00007f09e6eedaea in memray::tracking_api::Tracker::prepareNativeTrace (trace=std::optional [no contained value]) at src/memray/_memray/tracking_api.h:237
#10 0x00007f09e6eeed11 in memray::tracking_api::Tracker::trackAllocation (func=memray::hooks::Allocator::MMAP, size=2097152, ptr=0x7f09de600000) at src/memray/_memray/tracking_api.h:218
#11 memray::intercept::mmap (addr=, length=2097152, prot=, flags=, fd=, offset=) at src/memray/_memray/hooks.cpp:224
#12 0x00007f09e8694b5a in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#13 0x00007f09e8694bc2 in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#14 0x00007f09e8689994 in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#15 0x00007f09e863bc89 in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#16 0x00007f09e863cc5f in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#17 0x00007f09e863769c in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#18 0x00007f09e8621878 in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#19 0x00007f09e86abc5a in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#20 0x00007f09e86abef8 in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#21 0x00007f09e86ad926 in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#22 0x00007f09e8622162 in ?? () from /usr/lib/x86_64-linux-gnu/libjemalloc.so.2
#23 0x00007f09e6eee4c5 in memray::hooks::SymbolHook::operator()(unsigned long) const (this=0x7f09e6f6c6c0 ) at src/memray/_memray/hooks.h:100
#24 memray::intercept::malloc (size=168) at src/memray/_memray/hooks.cpp:169
One is that we are tracking both malloc and the underlying mmap. The other is that I can see our friend __tls_get_addr in the stack. I don't think that's what is causing problems, but now it makes me suspicious.
Well, seems that I cannot debug this on my aarch64 laptop 😓
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0xaaaabe679b9b ???
0xaaaabe9bcefb ???
0xaaaabe9bd0e3 ???
0xaaaabeb64873 ???
0xaaaabe976ebb ???
0xaaaabe8cf49b ???
0xaaaabe5ca2bf ???
0xaaaabe971a37 ???
0xaaaabe971f47 ???
0xaaaabe971ff7 ???
0xaaaabe8782d7 ???
0xffff914f5387 _td_fetch_value
./nptl_db/fetch-value.c:115
0xffff914f230f td_ta_map_lwp2thr
./nptl_db/td_ta_map_lwp2thr.c:194
0xaaaabe807d8f ???
0xaaaabe809347 ???
0xaaaabe973e73 ???
0xaaaabe7d081f ???
0xaaaabe7dc033 ???
0xaaaabeb64d03 ???
0xaaaabeb657f7 ???
0xaaaabe980787 ???
0xaaaabe980a6b ???
0xaaaabe818dcb ???
0xaaaabe818f1b ???
0xaaaabe81acff ???
0xaaaabe81b733 ???
0xaaaabe5c1183 ???
0xffff9883777f __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
0xffff98837857 __libc_start_main_impl
../csu/libc-start.c:381
0xaaaabe5c73af ???
0xffffffffffffffff ???
---------------------
/build/gdb-yCDzia/gdb-13.1/gdb/thread.c:85: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) [answered Y; input not from terminal]
This is the stack I am getting on aarch64 after adding a recursion guard to malloc:
(venv) root@64e4bedf6306:/src# eu-stack -p 7999 --verbose
PID 7999 - process
TID 7999:
#0 0x0000ffffb392b654 __futex_abstimed_wait_common64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:57:12
#1 0x0000ffffb392b654 __futex_abstimed_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:87:9
#2 0x0000ffffb392b654 __GI___futex_abstimed_wait_cancelable64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:139:10
#3 0x0000ffffb392e190 - 1 __pthread_cond_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:503:10
#4 0x0000ffffb392e190 - 1 ___pthread_cond_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:618:10
#5 0x0000ffffa4f219f4 - 1 base::ConditionVariable::Wait() - /usr/lib/aarch64-linux-gnu/libQt6WebEngineCore.so.6.4.2
#6 0x006affffa4f226b4 - 1 - /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
#7 0x006affffa4f226b4 - 1 - /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
#8 0x0009ffffa4f228bc - 1 - /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
#9 0x0010ffffa3690428 - 1 - /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
#10 0x0050ffffa47ee8ec - 1 - /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
#11 0x0062ffffa1e76694 - 1 - /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
#12 0x0047ffffa1e77d74 - 1 - /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
#13 0x0054ffffa1e5cd98 - 1 - /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
#14 0x0067ffffa4e4d230 - 1 - /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
#15 0x0000ffffa4e4d6f8 - 1 QWebEnginePage::QWebEnginePage(QObject*) - /usr/lib/aarch64-linux-gnu/libQt6WebEngineCore.so.6.4.2
#16 0x0000ffffb24c38c0 - 1
eu-stack: dwfl_thread_getframes tid 7999 at 0xffffb24c38bf in <unknown>: No DWARF information found
TID 8000:
#0 0x0000ffffb392b654 __futex_abstimed_wait_common64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:57:12
#1 0x0000ffffb392b654 __futex_abstimed_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:87:9
#2 0x0000ffffb392b654 __GI___futex_abstimed_wait_cancelable64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:139:10
#3 0x0000ffffb392e7a8 - 1 __pthread_cond_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:503:10
#4 0x0000ffffb392e7a8 - 1 ___pthread_cond_clockwait64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:682:10
#5 0x0000ffffb392e7a8 - 1 ___pthread_cond_clockwait64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:670:1
#6 0x0000ffffb213bc3c - 1 - /host_virtiofs/Users/pgalindo3/github/memray/src/memray/_memray.cpython-311-aarch64-linux-gnu.so
#7 0x0000ffffb213bc3c - 1 - /host_virtiofs/Users/pgalindo3/github/memray/src/memray/_memray.cpython-311-aarch64-linux-gnu.so
#8 0x0000ffffb21388bc - 1 - /host_virtiofs/Users/pgalindo3/github/memray/src/memray/_memray.cpython-311-aarch64-linux-gnu.so
#9 0x0000ffffb2138944 - 1 - /host_virtiofs/Users/pgalindo3/github/memray/src/memray/_memray.cpython-311-aarch64-linux-gnu.so
#10 0x0000ffffb2138ac4 - 1 - /host_virtiofs/Users/pgalindo3/github/memray/src/memray/_memray.cpython-311-aarch64-linux-gnu.so
#11 0x0000ffffb2138b34 - 1 - /host_virtiofs/Users/pgalindo3/github/memray/src/memray/_memray.cpython-311-aarch64-linux-gnu.so
#12 0x0000ffffb375e9dc - 1 execute_native_thread_routine - /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30
../../../../../src/libstdc++-v3/src/c++11/thread.cc:82:18
#13 0x0000ffffafd4eafc - 1
#14 0x0000ffffafd4eafc - 1
eu-stack: dwfl_thread_getframes tid 8000 at 0xffffafd4eafb in <unknown>: No DWARF information found
TID 8001:
#0 0x0000ffffb392b93c futex_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
../sysdeps/nptl/futex-internal.h:146:13
#1 0x0000ffffb392b93c __GI___lll_lock_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/lowlevellock.c:49:7
#2 0x0000ffffb3931f10 - 1 lll_mutex_lock_optimized - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_mutex_lock.c:48:5
#3 0x0000ffffb3931f10 - 1 ___pthread_mutex_lock - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_mutex_lock.c:93:7
#4 0x0000ffffb3bf4040 - 1 malloc_mutex_lock_final - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
include/jemalloc/internal/mutex.h:151:2
#5 0x0000ffffb3bf4040 - 1 je_malloc_mutex_lock_slow - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/mutex.c:90:2
#6 0x0000ffffb3ba9a68 - 1 malloc_mutex_lock - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
include/jemalloc/internal/mutex.h:217:4
#7 0x0000ffffb3ba9a68 - 1 je_arena_choose_hard - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/jemalloc.c:534:3
#8 0x0000ffffb3ba9d30 - 1 arena_choose_impl - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
include/jemalloc/internal/jemalloc_internal_inlines_b.h:46:9
#9 0x0000ffffb3baa254 - 1 arena_choose - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
include/jemalloc/internal/jemalloc_internal_inlines_b.h:88:9
#10 0x0000ffffb3baa254 - 1 tcache_alloc_small - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
include/jemalloc/internal/tcache_inlines.h:56:11
#11 0x0000ffffb3baa254 - 1 arena_malloc - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
include/jemalloc/internal/arena_inlines_b.h:151:11
#12 0x0000ffffb3baa254 - 1 iallocztm - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
include/jemalloc/internal/jemalloc_internal_inlines_c.h:55:8
#13 0x0000ffffb3baa254 - 1 imalloc_no_sample - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/jemalloc.c:2398:9
#14 0x0000ffffb3baa254 - 1 imalloc_body - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/jemalloc.c:2573:16
#15 0x0000ffffb3baa254 - 1 imalloc - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/jemalloc.c:2687:10
#16 0x0000ffffb3baa254 - 1 je_malloc_default - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/jemalloc.c:2722:2
#17 0x0000ffffb3c03fa8 - 1 fallback_impl<false> - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/jemalloc_cpp.cpp:98:28
#18 0x0000ffffb2117d74 - 1 - /host_virtiofs/Users/pgalindo3/github/memray/src/memray/_memray.cpython-311-aarch64-linux-gnu.so
#19 0x0000ffffb2117d74 - 1 - /host_virtiofs/Users/pgalindo3/github/memray/src/memray/_memray.cpython-311-aarch64-linux-gnu.so
#20 0x0000ffffb2116100 - 1 - /host_virtiofs/Users/pgalindo3/github/memray/src/memray/_memray.cpython-311-aarch64-linux-gnu.so
#21 0x0000ffffb3bf6c88 - 1 os_pages_map - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/pages.c:149:9
#22 0x0000000000200000 - 1
#23 0x0000000000200000 - 1
eu-stack: dwfl_thread_getframes tid 8001 at 0x1fffff in <unknown>: No DWARF information found
TID 8002:
#0 0x0000ffffb392b654 __futex_abstimed_wait_common64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:57:12
#1 0x0000ffffb392b654 __futex_abstimed_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:87:9
#2 0x0000ffffb392b654 __GI___futex_abstimed_wait_cancelable64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:139:10
#3 0x0000ffffb392e190 - 1 __pthread_cond_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:503:10
#4 0x0000ffffb392e190 - 1 ___pthread_cond_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:618:10
#5 0x0000ffff9c4c35ec - 1 cnd_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:135:13
#6 0x0000ffff9ca27474 - 1 pipe_semaphore_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/auxiliary/os/os_thread.h:108:7
#7 0x0000ffff9ca27474 - 1 thread_function - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/drivers/llvmpipe/lp_rast.c:1184:7
#8 0x0000ffff9c4c34fc - 1 impl_thrd_routine - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:67:29
#9 0x0000ffffb392edd8 - 1 start_thread - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_create.c:442:8
#10 0x0000ffffb3997e9c - 1 thread_start - /usr/lib/aarch64-linux-gnu/libc.so.6
../sysdeps/unix/sysv/linux/aarch64/clone.S:79
TID 8003:
#0 0x0000ffffb392b654 __futex_abstimed_wait_common64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:57:12
#1 0x0000ffffb392b654 __futex_abstimed_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:87:9
#2 0x0000ffffb392b654 __GI___futex_abstimed_wait_cancelable64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:139:10
#3 0x0000ffffb392e190 - 1 __pthread_cond_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:503:10
#4 0x0000ffffb392e190 - 1 ___pthread_cond_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:618:10
#5 0x0000ffff9c4c35ec - 1 cnd_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:135:13
#6 0x0000ffff9ca27474 - 1 pipe_semaphore_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/auxiliary/os/os_thread.h:108:7
#7 0x0000ffff9ca27474 - 1 thread_function - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/drivers/llvmpipe/lp_rast.c:1184:7
#8 0x0000ffff9c4c34fc - 1 impl_thrd_routine - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:67:29
#9 0x0000ffffb392edd8 - 1 start_thread - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_create.c:442:8
#10 0x0000ffffb3997e9c - 1 thread_start - /usr/lib/aarch64-linux-gnu/libc.so.6
../sysdeps/unix/sysv/linux/aarch64/clone.S:79
TID 8004:
#0 0x0000ffffb392b654 __futex_abstimed_wait_common64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:57:12
#1 0x0000ffffb392b654 __futex_abstimed_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:87:9
#2 0x0000ffffb392b654 __GI___futex_abstimed_wait_cancelable64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:139:10
#3 0x0000ffffb392e190 - 1 __pthread_cond_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:503:10
#4 0x0000ffffb392e190 - 1 ___pthread_cond_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:618:10
#5 0x0000ffff9c4c35ec - 1 cnd_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:135:13
#6 0x0000ffff9ca27474 - 1 pipe_semaphore_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/auxiliary/os/os_thread.h:108:7
#7 0x0000ffff9ca27474 - 1 thread_function - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/drivers/llvmpipe/lp_rast.c:1184:7
#8 0x0000ffff9c4c34fc - 1 impl_thrd_routine - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:67:29
#9 0x0000ffffb392edd8 - 1 start_thread - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_create.c:442:8
#10 0x0000ffffb3997e9c - 1 thread_start - /usr/lib/aarch64-linux-gnu/libc.so.6
../sysdeps/unix/sysv/linux/aarch64/clone.S:79
TID 8005:
#0 0x0000ffffb392b654 __futex_abstimed_wait_common64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:57:12
#1 0x0000ffffb392b654 __futex_abstimed_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:87:9
#2 0x0000ffffb392b654 __GI___futex_abstimed_wait_cancelable64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:139:10
#3 0x0000ffffb392e190 - 1 __pthread_cond_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:503:10
#4 0x0000ffffb392e190 - 1 ___pthread_cond_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:618:10
#5 0x0000ffff9c4c35ec - 1 cnd_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:135:13
#6 0x0000ffff9ca27474 - 1 pipe_semaphore_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/auxiliary/os/os_thread.h:108:7
#7 0x0000ffff9ca27474 - 1 thread_function - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/drivers/llvmpipe/lp_rast.c:1184:7
#8 0x0000ffff9c4c34fc - 1 impl_thrd_routine - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:67:29
#9 0x0000ffffb392edd8 - 1 start_thread - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_create.c:442:8
#10 0x0000ffffb3997e9c - 1 thread_start - /usr/lib/aarch64-linux-gnu/libc.so.6
../sysdeps/unix/sysv/linux/aarch64/clone.S:79
TID 8006:
#0 0x0000ffffb392b654 __futex_abstimed_wait_common64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:57:12
#1 0x0000ffffb392b654 __futex_abstimed_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:87:9
#2 0x0000ffffb392b654 __GI___futex_abstimed_wait_cancelable64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:139:10
#3 0x0000ffffb392e190 - 1 __pthread_cond_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:503:10
#4 0x0000ffffb392e190 - 1 ___pthread_cond_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:618:10
#5 0x0000ffff9c4c35ec - 1 cnd_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:135:13
#6 0x0000ffff9ca27474 - 1 pipe_semaphore_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/auxiliary/os/os_thread.h:108:7
#7 0x0000ffff9ca27474 - 1 thread_function - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/drivers/llvmpipe/lp_rast.c:1184:7
#8 0x0000ffff9c4c34fc - 1 impl_thrd_routine - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:67:29
#9 0x0000ffffb392edd8 - 1 start_thread - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_create.c:442:8
#10 0x0000ffffb3997e9c - 1 thread_start - /usr/lib/aarch64-linux-gnu/libc.so.6
../sysdeps/unix/sysv/linux/aarch64/clone.S:79
TID 8007:
#0 0x0000ffffb392b654 __futex_abstimed_wait_common64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:57:12
#1 0x0000ffffb392b654 __futex_abstimed_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:87:9
#2 0x0000ffffb392b654 __GI___futex_abstimed_wait_cancelable64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:139:10
#3 0x0000ffffb392e190 - 1 __pthread_cond_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:503:10
#4 0x0000ffffb392e190 - 1 ___pthread_cond_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:618:10
#5 0x0000ffff9c4c35ec - 1 cnd_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:135:13
#6 0x0000ffff9ca24324 - 1 lp_cs_tpool_worker - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/drivers/llvmpipe/lp_cs_tpool.c:49:10
#7 0x0000ffff9c4c34fc - 1 impl_thrd_routine - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:67:29
#8 0x0000ffffb392edd8 - 1 start_thread - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_create.c:442:8
#9 0x0000ffffb3997e9c - 1 thread_start - /usr/lib/aarch64-linux-gnu/libc.so.6
../sysdeps/unix/sysv/linux/aarch64/clone.S:79
TID 8008:
#0 0x0000ffffb392b654 __futex_abstimed_wait_common64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:57:12
#1 0x0000ffffb392b654 __futex_abstimed_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:87:9
#2 0x0000ffffb392b654 __GI___futex_abstimed_wait_cancelable64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:139:10
#3 0x0000ffffb392e190 - 1 __pthread_cond_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:503:10
#4 0x0000ffffb392e190 - 1 ___pthread_cond_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:618:10
#5 0x0000ffff9c4c35ec - 1 cnd_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:135:13
#6 0x0000ffff9ca24324 - 1 lp_cs_tpool_worker - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/drivers/llvmpipe/lp_cs_tpool.c:49:10
#7 0x0000ffff9c4c34fc - 1 impl_thrd_routine - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:67:29
#8 0x0000ffffb392edd8 - 1 start_thread - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_create.c:442:8
#9 0x0000ffffb3997e9c - 1 thread_start - /usr/lib/aarch64-linux-gnu/libc.so.6
../sysdeps/unix/sysv/linux/aarch64/clone.S:79
TID 8009:
#0 0x0000ffffb392b654 __futex_abstimed_wait_common64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:57:12
#1 0x0000ffffb392b654 __futex_abstimed_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:87:9
#2 0x0000ffffb392b654 __GI___futex_abstimed_wait_cancelable64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:139:10
#3 0x0000ffffb392e190 - 1 __pthread_cond_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:503:10
#4 0x0000ffffb392e190 - 1 ___pthread_cond_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:618:10
#5 0x0000ffff9c4c35ec - 1 cnd_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:135:13
#6 0x0000ffff9ca24324 - 1 lp_cs_tpool_worker - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/drivers/llvmpipe/lp_cs_tpool.c:49:10
#7 0x0000ffff9c4c34fc - 1 impl_thrd_routine - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:67:29
#8 0x0000ffffb392edd8 - 1 start_thread - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_create.c:442:8
#9 0x0000ffffb3997e9c - 1 thread_start - /usr/lib/aarch64-linux-gnu/libc.so.6
../sysdeps/unix/sysv/linux/aarch64/clone.S:79
TID 8010:
#0 0x0000ffffb392b654 __futex_abstimed_wait_common64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:57:12
#1 0x0000ffffb392b654 __futex_abstimed_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:87:9
#2 0x0000ffffb392b654 __GI___futex_abstimed_wait_cancelable64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:139:10
#3 0x0000ffffb392e190 - 1 __pthread_cond_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:503:10
#4 0x0000ffffb392e190 - 1 ___pthread_cond_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:618:10
#5 0x0000ffff9c4c35ec - 1 cnd_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:135:13
#6 0x0000ffff9ca24324 - 1 lp_cs_tpool_worker - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/drivers/llvmpipe/lp_cs_tpool.c:49:10
#7 0x0000ffff9c4c34fc - 1 impl_thrd_routine - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:67:29
#8 0x0000ffffb392edd8 - 1 start_thread - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_create.c:442:8
#9 0x0000ffffb3997e9c - 1 thread_start - /usr/lib/aarch64-linux-gnu/libc.so.6
../sysdeps/unix/sysv/linux/aarch64/clone.S:79
TID 8011:
#0 0x0000ffffb392b654 __futex_abstimed_wait_common64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:57:12
#1 0x0000ffffb392b654 __futex_abstimed_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:87:9
#2 0x0000ffffb392b654 __GI___futex_abstimed_wait_cancelable64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:139:10
#3 0x0000ffffb392e190 - 1 __pthread_cond_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:503:10
#4 0x0000ffffb392e190 - 1 ___pthread_cond_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:618:10
#5 0x0000ffff9c4c35ec - 1 cnd_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:135:13
#6 0x0000ffff9ca24324 - 1 lp_cs_tpool_worker - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/gallium/drivers/llvmpipe/lp_cs_tpool.c:49:10
#7 0x0000ffff9c4c34fc - 1 impl_thrd_routine - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:67:29
#8 0x0000ffffb392edd8 - 1 start_thread - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_create.c:442:8
#9 0x0000ffffb3997e9c - 1 thread_start - /usr/lib/aarch64-linux-gnu/libc.so.6
../sysdeps/unix/sysv/linux/aarch64/clone.S:79
TID 8012:
#0 0x0000ffffb392b654 __futex_abstimed_wait_common64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:57:12
#1 0x0000ffffb392b654 __futex_abstimed_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:87:9
#2 0x0000ffffb392b654 __GI___futex_abstimed_wait_cancelable64 - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/futex-internal.c:139:10
#3 0x0000ffffb392e190 - 1 __pthread_cond_wait_common - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:503:10
#4 0x0000ffffb392e190 - 1 ___pthread_cond_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_cond_wait.c:618:10
#5 0x0000ffff9c4c35ec - 1 cnd_wait - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:135:13
#6 0x0000ffff9c481794 - 1 util_queue_thread_func - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/util/u_queue.c:290:10
#7 0x0000ffff9c4c34fc - 1 impl_thrd_routine - /usr/lib/aarch64-linux-gnu/dri/armada-drm_dri.so
../src/c11/impl/threads_posix.c:67:29
#8 0x0000ffffb392edd8 - 1 start_thread - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_create.c:442:8
#9 0x0000ffffb3997e9c - 1 thread_start - /usr/lib/aarch64-linux-gnu/libc.so.6
../sysdeps/unix/sysv/linux/aarch64/clone.S:79
TID 8013:
#0 0x0000ffffb392b93c futex_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
../sysdeps/nptl/futex-internal.h:146:13
#1 0x0000ffffb392b93c __GI___lll_lock_wait - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/lowlevellock.c:49:7
#2 0x0000ffffb3931f10 - 1 lll_mutex_lock_optimized - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_mutex_lock.c:48:5
#3 0x0000ffffb3931f10 - 1 ___pthread_mutex_lock - /usr/lib/aarch64-linux-gnu/libc.so.6
./nptl/pthread_mutex_lock.c:93:7
#4 0x0000ffffb3bf4040 - 1 malloc_mutex_lock_final - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
include/jemalloc/internal/mutex.h:151:2
#5 0x0000ffffb3bf4040 - 1 je_malloc_mutex_lock_slow - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/mutex.c:90:2
#6 0x0000ffffb3ba9a68 - 1 malloc_mutex_lock - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
include/jemalloc/internal/mutex.h:217:4
#7 0x0000ffffb3ba9a68 - 1 je_arena_choose_hard - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/jemalloc.c:534:3
#8 0x0000ffffb3c01f10 - 1 arena_choose_impl - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
include/jemalloc/internal/jemalloc_internal_inlines_b.h:46:9
#9 0x0000ffffb3c01f10 - 1 arena_choose_impl - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
include/jemalloc/internal/jemalloc_internal_inlines_b.h:32:1
#10 0x0000ffffb3c01f10 - 1 arena_choose - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
include/jemalloc/internal/jemalloc_internal_inlines_b.h:88:9
#11 0x0000ffffb3c01f10 - 1 je_tsd_tcache_data_init - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/tcache.c:740:11
#12 0x0000ffffb3c02198 - 1 je_tsd_tcache_enabled_data_init - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/tcache.c:644:3
#13 0x0000ffffb3c03a0c - 1 tsd_data_init - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/tsd.c:244:9
#14 0x0000ffffb3c03a0c - 1 je_tsd_fetch_slow - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/tsd.c:297:5
#15 0x0000ffffb3baa028 - 1 tsd_fetch_impl - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
include/jemalloc/internal/tsd.h:422:10
#16 0x0000ffffb3baa028 - 1 tsd_fetch - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
include/jemalloc/internal/tsd.h:448:9
#17 0x0000ffffb3baa028 - 1 imalloc - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/jemalloc.c:2681:15
#18 0x0000ffffb3baa028 - 1 je_malloc_default - /usr/lib/aarch64-linux-gnu/libjemalloc.so.2
src/jemalloc.c:2722:2
#19 0x0000ffffb2115154 - 1 - /host_virtiofs/Users/pgalindo3/github/memray/src/memray/_memray.cpython-311-aarch64-linux-gnu.so
#20 0x0000ffffb2115154 - 1 - /host_virtiofs/Users/pgalindo3/github/memray/src/memray/_memray.cpython-311-aarch64-linux-gnu.so
#21 0x0000ffffb3cd9698 - 1 malloc - /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
../include/rtld-malloc.h:56:10
#22 0x0000ffffb3cd9698 - 1 allocate_dtv_entry - /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
./elf/dl-tls.c:684:19
#23 0x0000ffffb3cd9698 - 1 allocate_and_init - /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
./elf/dl-tls.c:709:31
#24 0x0000ffffb3cd9698 - 1 tls_get_addr_tail - /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
./elf/dl-tls.c:907:31
#25 0x0000ffff8d9c6afc - 1
#26 0x0000ffff8d9c6afc - 1
eu-stack: dwfl_thread_getframes tid 8013 at 0xffff8d9c6afb in <unknown>: No DWARF information found
Seems that there are at least 2 threads that are allocating, but they don't seem to be re-entrant (I am assuming the normal call is os_pages_map to je_malloc_default and we are just in the middle). Maybe I am missing something and we are still malloc-ing inside memray somehow.
Seems that this is enough to solve the problem:
diff --git a/src/memray/_memray/elf_shenanigans.cpp b/src/memray/_memray/elf_shenanigans.cpp
index 98ac32a..99221db 100644
--- a/src/memray/_memray/elf_shenanigans.cpp
+++ b/src/memray/_memray/elf_shenanigans.cpp
@@ -166,7 +166,7 @@ phdrs_callback(dl_phdr_info* info, [[maybe_unused]] size_t size, void* data) noe
         patched.insert(info->dlpi_name);
     }

-    if (strstr(info->dlpi_name, "/ld-linux") || strstr(info->dlpi_name, "linux-vdso.so.1")) {
+    if (strstr(info->dlpi_name, "/ld-linux") || strstr(info->dlpi_name, "linux-vdso.so.1") || strstr(info->dlpi_name, "jemalloc")) {
         // Avoid chaos by not overwriting the symbols in the linker.
         // TODO: Don't override the symbols in our shared library!
         return 0;
diff --git a/src/memray/_memray/hooks.cpp b/src/memray/_memray/hooks.cpp
index b46a037..f388cdc 100644
--- a/src/memray/_memray/hooks.cpp
+++ b/src/memray/_memray/hooks.cpp
@@ -165,8 +165,11 @@ void*
 malloc(size_t size) noexcept
 {
     assert(hooks::malloc);
-
-    void* ptr = hooks::malloc(size);
+    void* ptr;
+    {
+        tracking_api::RecursionGuard guard;
+        ptr = hooks::malloc(size);
+    }
     tracking_api::Tracker::trackAllocation(ptr, size, hooks::Allocator::MALLOC);
     return ptr;
 }
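For readers unfamiliar with the pattern, here is a rough, self-contained sketch of how a thread-local recursion guard of this kind can work. It is not memray's implementation: RecursionGuard, g_in_guard, track_allocation, fake_allocator and hooked_malloc below are hypothetical stand-ins, and the sketch assumes the tracker simply ignores events raised while the guard is active.

```cpp
#include <cstddef>
#include <cstdlib>

// Thread-local flag: true while the current thread is inside a guarded region.
thread_local bool g_in_guard = false;

// RAII guard (hypothetical stand-in for tracking_api::RecursionGuard).
// It remembers the previous state so that nested guards restore correctly.
struct RecursionGuard {
    bool previous;
    RecursionGuard() : previous(g_in_guard) { g_in_guard = true; }
    ~RecursionGuard() { g_in_guard = previous; }
    RecursionGuard(const RecursionGuard&) = delete;
    RecursionGuard& operator=(const RecursionGuard&) = delete;
};

int g_tracked = 0;  // number of allocations the toy tracker recorded

// Hypothetical tracker: drops events raised while the guard is active.
void track_allocation(void* /*ptr*/, std::size_t /*size*/)
{
    if (g_in_guard) {
        return;
    }
    ++g_tracked;
}

// Stand-in for an allocator whose internals trigger another hooked call
// (here, an "mmap" event) before returning memory.
void* fake_allocator(std::size_t size)
{
    track_allocation(nullptr, 2097152);  // nested event, seen by the hook
    return std::malloc(size);
}

// Hooked malloc following the shape of the diff above: the real allocation
// runs under the guard, and only the outer call is tracked afterwards.
void* hooked_malloc(std::size_t size)
{
    void* ptr;
    {
        RecursionGuard guard;
        ptr = fake_allocator(size);
    }
    track_allocation(ptr, size);
    return ptr;
}
```

In this sketch a single call to hooked_malloc records exactly one allocation: the nested event raised inside the allocator is dropped because the flag is set, so the tracker never runs (and never allocates) while the allocator may be holding its own locks.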
I am still unsure how exactly the re-entrancy is happening because my gdb is busted :(
Here's the stack that we're deadlocking at on x86-64: https://gist.github.com/godlygeek/4cf3924b3d2be95f69a670f93672f0b1
2 threads are in jemalloc. The one that caused the deadlock is probably Thread 3:
- qthread_unix.cpp tries to use a thread-local variable in a new thread
- That uses __tls_get_addr to call allocate_dtv_entry
- Which calls malloc to perform the allocation
- Which calls our malloc hook
- Which calls jemalloc to perform the allocation
- Which calls mmap to allocate a new arena
- Which calls our mmap hook
- Which calls new std::vector<NativeTrace::ip_t>() to create a vector for storing the instruction pointers
- Which calls back into jemalloc while it was in the middle of allocating an arena and holding locks protecting its internal state
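The call chain described above can be reproduced in miniature. The sketch below is a toy model, not jemalloc: ToyAllocator and count_reentrant_calls are hypothetical names, and an atomic flag stands in for the allocator's internal mutex so that the re-entrant call is counted instead of hanging the thread.

```cpp
#include <atomic>
#include <functional>

// Toy allocator with a single non-reentrant internal lock (an assumption
// standing in for the mutexes jemalloc holds while creating an arena).
struct ToyAllocator {
    std::atomic<bool> locked{false};
    int reentrant_calls = 0;
    std::function<void()> mmap_hook;  // hypothetical interposed mmap hook

    void allocate()
    {
        // exchange() returns the previous value. In this single-threaded
        // sketch, `true` means this same thread already holds the lock,
        // which a real non-recursive mutex would turn into a hang.
        if (locked.exchange(true)) {
            ++reentrant_calls;
            return;
        }
        if (mmap_hook) {
            mmap_hook();  // the "mmap" call made while holding the lock
        }
        locked.store(false);
    }
};

// Wires up a hook that allocates (like creating a std::vector inside the
// mmap hook) and reports how many re-entrant calls were observed.
int count_reentrant_calls()
{
    ToyAllocator allocator;
    allocator.mmap_hook = [&allocator] { allocator.allocate(); };
    allocator.allocate();
    return allocator.reentrant_calls;
}
```

The outer allocate() takes the lock, the hook calls allocate() again on the same thread, and that second call finds the lock already taken. With a real pthread mutex the second call would block forever in futex_wait, which matches the top frames of the stuck threads.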
Hi @pablogsal - this question is off-topic from memray and PyQT unfortunately, but I've encountered a similar stacktrace to this on a Debian accessibility thread and was wondering what your interpretation of this stacktrace was?
(That thread is also about a potential process/threading deadlock situation, but as far as I can tell, jemalloc isn't in use there.)
Is there an existing issue for this?
Current Behavior
Hello, me again! Now that I can do native runs with PyQt WebEngine I went back to some scenarios I had previously tested with a non-native run (for context I'm looking into https://github.com/qutebrowser/qutebrowser/issues/1476). I'm looking into how different memory allocators might make the memory load of a particular application a bit lighter.
When running an application based on PyQt WebEngine with the jemalloc library in LD_PRELOAD the run hangs. With other mallocs (glibc, mimalloc, tcmalloc) the run completes fine.
Here are the stack traces. It seems to be hung up on a semaphore, which sounds like it could be some fun timing thing. It also looks like it is coming from DBus-related code, so it probably doesn't need all of QtWebEngine to reproduce.
(gdb) thread apply all bt
And here is a script to reproduce the freeze in a container (remove the --native or the whole -m memray run --native bits to see it run through fully for all mallocs):
Expected Behavior
No response
Steps To Reproduce
see above
Memray Version
latest build from https://github.com/bloomberg/memray/actions/runs/4427648999
Python Version
3.11
Operative System
Linux
Anything else?
No response