Closed eddy16112 closed 1 year ago
The value from gdb seems to be valid:
(gdb) p cudie
$1 = (Dwarf_Die *) 0x7fffdc184c20
(gdb) p *cudie
$2 = {addr = 0x7fffca6665df, cu = 0x7fffdc6bd6b0, abbrev = 0x7fffdc6ccfd0, padding__ = 0}
(gdb) p/x trace_addr
$3 = 0x5555570755ee
(gdb) p/x mod_bias
$4 = 0x555555554000
Maybe try under valgrind. I wouldn't rule out a bug in libdw. You can also try a different backend, like libbfd, to compare.
I tried BACKWARD_HAS_BACKTRACE_SYMBOL, and it works. I cannot get the libbfd backend compiled, because it needs libiberty-dev, which is not available from conda.
Valgrind is a little bit tricky, because my program is a GASNet program, and valgrind complains about GASNet and aborts.
I got another backtrace, which also ends in a segfault:
#6 0x00007f450f40c92a in dwarf_highpc () from /lib/x86_64-linux-gnu/libdw.so.1
#7 0x00007f450f40f868 in dwarf_ranges () from /lib/x86_64-linux-gnu/libdw.so.1
#8 0x000055f99ee3f410 in backward::TraceResolverLinuxImpl<backward::trace_resolver_tag::libdw>::die_has_pc (die=0x55f9a45555e0, pc=28448202) at /scratch2/wwu/legion-master/runtime/realm/new_backward.hpp:2000
#9 0x000055f99ee4180b in backward::TraceResolverLinuxImpl<backward::trace_resolver_tag::libdw>::deep_first_search_by_pc<backward::TraceResolverLinuxImpl<backward::trace_resolver_tag::libdw>::inliners_search_cb> (parent_die=0x55f9a4555670, pc=28448202, cb=...)
at /scratch2/wwu/legion-master/runtime/realm/new_backward.hpp:2065
#10 0x000055f99ee417ea in backward::TraceResolverLinuxImpl<backward::trace_resolver_tag::libdw>::deep_first_search_by_pc<backward::TraceResolverLinuxImpl<backward::trace_resolver_tag::libdw>::inliners_search_cb> (parent_die=0x55f9a4555700, pc=28448202, cb=...)
at /scratch2/wwu/legion-master/runtime/realm/new_backward.hpp:2062
#11 0x000055f99ee417ea in backward::TraceResolverLinuxImpl<backward::trace_resolver_tag::libdw>::deep_first_search_by_pc<backward::TraceResolverLinuxImpl<backward::trace_resolver_tag::libdw>::inliners_search_cb> (parent_die=0x7f44f89e81c0, pc=28448202, cb=...)
at /scratch2/wwu/legion-master/runtime/realm/new_backward.hpp:2062
#12 0x000055f99ee3f022 in backward::TraceResolverLinuxImpl<backward::trace_resolver_tag::libdw>::resolve (this=0x55f9a45558a0, trace=...) at /scratch2/wwu/legion-master/runtime/realm/new_backward.hpp:1922
Another issue: BACKWARD_HAS_BACKTRACE and BACKWARD_HAS_UNWIND generate different backtraces.
BACKWARD_HAS_BACKTRACE (wrong, see frame 6):
#16 Source "../sysdeps/unix/sysv/linux/x86_64/clone.S", line 97, in __clone
#15 Source "/build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c", line 478, in start_thread
#14 Source "/scratch2/wwu/legion-master/runtime/realm/threads.cc", line 784, in pthread_entry
#13 Source "/scratch2/wwu/legion-master/runtime/realm/threads.inl", line 98, in thread_entry_wrapper<Realm::ThreadedTaskScheduler, &Realm::ThreadedTaskScheduler::scheduler_loop_wlock>
#12 Source "/scratch2/wwu/legion-master/runtime/realm/tasks.cc", line 1216, in scheduler_loop_wlock
#11 Source "/scratch2/wwu/legion-master/runtime/realm/tasks.cc", line 1105, in scheduler_loop
#10 Source "/scratch2/wwu/legion-master/runtime/realm/tasks.cc", line 1367, in execute_task
#9 Source "/scratch2/wwu/legion-master/runtime/realm/tasks.cc", line 304, in execute_on_processor
#8 Source "/scratch2/wwu/legion-master/runtime/realm/proc_impl.cc", line 1129, in execute_task
#7 Source "/scratch2/wwu/legion-master/runtime/legion/legion.inl", line 21167, in legion_task_wrapper<daxpy_task>
#6 Source "/scratch2/wwu/legion-master/tutorial/07_partitioning/partitioning.cc", line 278, in check_task
#5 Object "/usr/lib/x86_64-linux-gnu/libc-2.31.so", at 0x7f714c1a8fd6, in
#4 | Source "/build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c", line 970, in get_sysdep_segment_value
Source "/build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c", line 509, in _nl_load_domain
#3 Source "/build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c", line 81, in abort
#2 Source "../sysdeps/unix/sysv/linux/raise.c", line 51, in raise
#1 Object "/usr/lib/x86_64-linux-gnu/libpthread-2.31.so", at 0x7f714c6eb420, in __restore_rt
#0 Source "/scratch2/wwu/legion-master/runtime/realm/runtime_impl.cc", line 2876, in realm_backtrace
(py39) wwu@g0004:/scratch2/wwu/legion-master/tutorial/07_partitioning$ ./partitioning -ll:cpu 1 -ll:force_kthreads
BACKWARD_HAS_UNWIND (correct, see frame 8):
#19 Object "", at 0xffffffffffffffff, in
#18 Source "../sysdeps/unix/sysv/linux/x86_64/clone.S", line 95, in __clone
#17 Source "/build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c", line 477, in start_thread
#16 Source "/scratch2/wwu/legion-master/runtime/realm/threads.cc", line 781, in pthread_entry
#15 Source "/scratch2/wwu/legion-master/runtime/realm/threads.inl", line 97, in thread_entry_wrapper<Realm::ThreadedTaskScheduler, &Realm::ThreadedTaskScheduler::scheduler_loop_wlock>
#14 Source "/scratch2/wwu/legion-master/runtime/realm/tasks.cc", line 1217, in scheduler_loop_wlock
#13 Source "/scratch2/wwu/legion-master/runtime/realm/tasks.cc", line 1105, in scheduler_loop
#12 Source "/scratch2/wwu/legion-master/runtime/realm/tasks.cc", line 1366, in execute_task
#11 Source "/scratch2/wwu/legion-master/runtime/realm/tasks.cc", line 302, in execute_on_processor
#10 Source "/scratch2/wwu/legion-master/runtime/realm/proc_impl.cc", line 1129, in execute_task
#9 Source "/scratch2/wwu/legion-master/runtime/legion/legion.inl", line 21165, in legion_task_wrapper<daxpy_task>
#8 Source "/scratch2/wwu/legion-master/tutorial/07_partitioning/partitioning.cc", line 256, in daxpy_task
#7 Source "/build/glibc-SzIz7B/glibc-2.31/assert/assert.c", line 101, in __assert_fail
#6 Source "/build/glibc-SzIz7B/glibc-2.31/assert/assert.c", line 92, in __assert_fail_base
#5 Source "/build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c", line 79, in abort
#4 Source "../sysdeps/unix/sysv/linux/raise.c", line 51, in raise
#3 Object "/usr/lib/x86_64-linux-gnu/libpthread-2.31.so", at 0x7f10d256341f, in
#2 Source "/scratch2/wwu/legion-master/runtime/realm/runtime_impl.cc", line 2875, in realm_backtrace
#1 Source "/scratch2/wwu/legion-master/runtime/realm/new_backward.hpp", line 878, in load_here
#0 Source "/scratch2/wwu/legion-master/runtime/realm/new_backward.hpp", line 860, in unwind<backward::StackTraceImpl<backward::system_tag::linux_tag>::callback>
Interesting that Valgrind complains about GASNet. I only learned about GASNet just now, and I do not understand what it does that upsets Valgrind.
Note that there are two levels of dependencies in backward-cpp: stack walking and trace resolving.
Walking the stack finds out which functions were traversed. You get the object file and the memory address of the machine code. This is what you select with BACKWARD_HAS_UNWIND, BACKWARD_HAS_LIBUNWIND and BACKWARD_HAS_BACKTRACE (see https://github.com/bombela/backward-cpp/blob/master/backward.hpp#L98).
The second level is one of the libraries that read the debug symbols from the object files. It is used to map an address in an object file to a source file and line number.
In your two traces, unwind should give you the correct stack walking, as it comes from the compiler itself.
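To make the two levels concrete, here is a minimal sketch that keeps them as two separate functions. It is not backward-cpp's code: it assumes Linux with glibc and uses glibc's `backtrace()` for the walking step and `backtrace_symbols()` for the resolving step. The function names `walk_stack` and `resolve_frames` are made up for this illustration.

```cpp
#include <execinfo.h>  // backtrace, backtrace_symbols (glibc)

#include <cassert>     // assert, for callers checking the results
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Level 1: stack walking. Collects raw return addresses only; no symbol
// or debug information is consulted here.
std::vector<void *> walk_stack(int max_frames) {
  if (max_frames <= 0) return {};
  std::vector<void *> addrs(static_cast<std::size_t>(max_frames));
  int n = backtrace(addrs.data(), max_frames);
  addrs.resize(n > 0 ? static_cast<std::size_t>(n) : 0);
  return addrs;
}

// Level 2: trace resolving. Maps each address to a human-readable string.
// backtrace_symbols() only reads the dynamic symbol table, so it yields
// object and symbol names, not source files and line numbers.
std::vector<std::string> resolve_frames(const std::vector<void *> &addrs) {
  std::vector<std::string> out;
  if (addrs.empty()) return out;
  char **syms = backtrace_symbols(addrs.data(), static_cast<int>(addrs.size()));
  if (syms == nullptr) return out;
  for (std::size_t i = 0; i < addrs.size(); ++i) out.emplace_back(syms[i]);
  std::free(syms);  // one allocation holds the array and all the strings
  return out;
}
```

Calling `resolve_frames(walk_stack(32))` gives one string per frame. Swapping the body of only one of the two functions is the equivalent of changing only the walking macro or only the resolving macro, which helps to tell a wrong stack walk apart from a crash inside the resolver.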
About your second gdb backtrace (the one that starts at frame #6 in dwarf_highpc): sometimes your traces are sorted in a different order. Is this intended? Where are frames 0 to 5? The segfault should be at frame 0; was it inside the DWARF library?
I figured out the problem. The stack allocated for handling the signal was too small; increasing its size fixed the problem.
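For anyone hitting the same thing, here is a minimal sketch of giving a signal handler a larger alternate stack with POSIX `sigaltstack()` and `SA_ONSTACK`. It is not the code from my application: the 8 MiB size, the names `kAltStackSize`, `install_handler_with_large_stack` and `handler_ran_on_alt_stack`, and the trivial handler are assumptions made for illustration. A real handler would call the backward API where the comment indicates.

```cpp
#include <signal.h>  // sigaltstack, sigaction, stack_t, SA_ONSTACK (POSIX)

#include <cassert>   // assert, for callers checking the results
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumed size for this sketch; DWARF resolution can recurse deeply, so the
// stack is sized generously rather than at the platform minimum.
static const std::size_t kAltStackSize = 8 * 1024 * 1024;

static char *g_alt_stack_base = nullptr;
static volatile sig_atomic_t g_ran_on_alt_stack = 0;

// Records whether the handler is executing on the alternate stack by
// checking where one of its local variables lives.
static void on_signal(int) {
  char local = 0;
  std::uintptr_t p = reinterpret_cast<std::uintptr_t>(&local);
  std::uintptr_t lo = reinterpret_cast<std::uintptr_t>(g_alt_stack_base);
  g_ran_on_alt_stack = (p >= lo && p < lo + kAltStackSize) ? 1 : 0;
  // A real handler would collect and print the backtrace here.
}

// Allocates the alternate stack for the calling thread and registers
// on_signal for signo with SA_ONSTACK. Returns true on success.
bool install_handler_with_large_stack(int signo) {
  g_alt_stack_base = static_cast<char *>(std::malloc(kAltStackSize));
  if (g_alt_stack_base == nullptr) return false;

  stack_t ss;
  ss.ss_sp = g_alt_stack_base;
  ss.ss_size = kAltStackSize;
  ss.ss_flags = 0;
  if (sigaltstack(&ss, nullptr) != 0) return false;

  struct sigaction sa = {};
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_ONSTACK;  // run the handler on the alternate stack
  sa.sa_handler = on_signal;
  return sigaction(signo, &sa, nullptr) == 0;
}

bool handler_ran_on_alt_stack() { return g_ran_on_alt_stack == 1; }
```

Note that the alternate signal stack is a per-thread attribute, so in a multithreaded program each thread that may take the signal needs its own call to `sigaltstack()`.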
I am using gcc 9.4 on Linux with both BACKWARD_HAS_DW=1 and BACKWARD_HAS_BACKTRACE=1, and I call the backward API from a single thread. I occasionally catch a segfault from libdw. Here is the backtrace from gdb. Any idea about the segfault?