ROCm / omnitrace

Omnitrace: Application Profiling, Tracing, and Analysis
https://rocm.docs.amd.com/projects/omnitrace/en/latest/
MIT License

Still an issue related to "Segmentation fault in multi-threaded code" #315

Closed: gmarkomanolis closed 9 months ago

gmarkomanolis commented 11 months ago

The user who reported issue #304 tried the new version, and it still fails:

Backtrace (demangled):
[PID=22896][TID=2040][0/7] __restore_rt
[PID=22896][TID=2040][1/7] std::_Hashtable<unsigned long, std::pair<unsigned long const, omnitrace::(anonymous namespace)::cid_data>, std::allocator<std::pair<unsigned long const, omnitrace::(anonymous namespace)::cid_data>>, std::__detail::_Select1st, std::equal_to<unsigned long>, std::hash<unsigned long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true>>::_M_find_node(unsigned long, unsigned long const&, unsigned long) const [clone .isra.848] +0x14
[PID=22896][TID=2040][2/7] omnitrace::hip_activity_callback(char const*, char const*, void*) +0x1321
[PID=22896][TID=2040][3/7] roctracer::MemoryPool::reader_fun(void*) +0x6a
[PID=22896][TID=2040][4/7] omnitrace::component::pthread_create_gotcha::wrapper::operator()() const +0x144
[PID=22896][TID=2040][5/7] omnitrace::component::pthread_create_gotcha::wrapper::wrap(void*) +0xa2
[PID=22896][TID=2040][6/7] start_thread +0xdc

Backtrace (lineinfo):
[PID=22896][TID=2040][0/6] [/lib64/libpthread.so.0:?] __restore_rt
[PID=22896][TID=2040][1/6] [/home/omnitrace/source/lib/omnitrace/library/roctracer.cpp:944] omnitrace::hip_activity_callback(char const*, char const*, void*)
    [/usr/include/c++/7/bits/unordered_map.h:920] find
    [/usr/include/c++/7/bits/hashtable.h:1426] find
[PID=22896][TID=2040][2/6] [??:?] roctracer::MemoryPool::reader_fun(void*)
[PID=22896][TID=2040][3/6] [/home/omnitrace/source/lib/omnitrace/library/components/pthread_create_gotcha.cpp:276] omnitrace::component::pthread_create_gotcha::wrapper::operator()() const
[PID=22896][TID=2040][4/6] [/home/omnitrace/source/lib/omnitrace/library/components/pthread_create_gotcha.cpp:308] omnitrace::component::pthread_create_gotcha::wrapper::wrap(void*)
[PID=22896][TID=2040][5/6] [/lib64/libpthread.so.0:?] start_thread

jrmadsen commented 10 months ago

Tell them to disable critical-tracing support.

jrmadsen commented 10 months ago

That error is a data race on some of the critical-trace data. Critical tracing has always been alpha quality, and development on it has effectively been abandoned in favor of causal profiling. Realistically, it should be removed at this point.

I think the option is OMNITRACE_CRITICAL_TRACE=OFF
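For readers unfamiliar with this class of bug: the backtrace ends inside `std::unordered_map::find`, and a standard hash map may not be looked up on one thread while another thread inserts into it without synchronization. The following is only a minimal sketch of the general pattern and of one common guard (a mutex around every access). It is not omnitrace's code, and the names `CidData`, `CidTable`, `DemoResult`, and `run_demo` are hypothetical. The fix actually recommended in this thread is to turn critical tracing off, not to patch the map.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <unordered_map>

// Hypothetical stand-in for per-correlation-id data; not omnitrace's cid_data.
struct CidData
{
    std::uint64_t parent_cid;
};

// Hypothetical wrapper: every access takes the same mutex, so a lookup can
// never observe the table in the middle of an insert or a rehash.
class CidTable
{
public:
    void insert(std::uint64_t cid, const CidData& data)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_table[cid] = data;
    }

    // Copies the entry into `out` and returns true if `cid` is present.
    bool find(std::uint64_t cid, CidData& out) const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto it = m_table.find(cid);
        if(it == m_table.end()) return false;
        out = it->second;
        return true;
    }

    std::size_t size() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_table.size();
    }

private:
    mutable std::mutex                         m_mutex;
    std::unordered_map<std::uint64_t, CidData> m_table;
};

struct DemoResult
{
    std::size_t entries;      // entries in the table after both threads finish
    std::size_t reader_hits;  // lookups that succeeded; depends on thread timing
};

// One thread inserts n entries while another looks the same keys up.
// Without the mutex in CidTable this would be a data race (undefined behavior).
DemoResult run_demo(std::uint64_t n)
{
    CidTable    table;
    std::size_t hits = 0;

    std::thread writer([&table, n]() {
        for(std::uint64_t i = 0; i < n; ++i)
            table.insert(i, CidData{ i + 1 });
    });

    std::thread reader([&table, &hits, n]() {
        CidData out{};
        for(std::uint64_t i = 0; i < n; ++i)
            if(table.find(i, out)) ++hits;
    });

    writer.join();
    reader.join();
    return DemoResult{ table.size(), hits };
}
```

Calling `run_demo(1000)` from a `main` function leaves 1000 entries in the table. The number of successful lookups varies from run to run because it depends on how the two threads interleave.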

jrmadsen commented 9 months ago

Setting OMNITRACE_CRITICAL_TRACE=OFF will fix this. Critical tracing will be removed soon: it was never completed because causal profiling superseded it.