wolfpld / tracy

Frame profiler
https://tracy.nereid.pl/
Other
8.61k stars 592 forks source link

ON_DEMAND mode causes deadlock with static initalization. #775

Closed mcourteaux closed 2 months ago

mcourteaux commented 2 months ago

Here is a minimal reproducable:

#define TRACY_ON_DEMAND 1

#include <tracy/Tracy.hpp>
#include <mutex>

struct S {
  inline static TracyLockable(std::mutex, mutex);
};

int main() {
  std::printf("This never prints.\n");

  return 0;
}

with:

project(tracy-bug)

add_executable(bug "main.cpp")
find_package(Tracy REQUIRED)
target_compile_options(bug PUBLIC "-g")
target_link_libraries(bug PUBLIC Tracy::TracyClient)

which hangs here:

(gdb) bt
#0  futex_wait (private=128, expected=2, futex_word=0x55555574dfd8 <tracy::s_profiler+728>) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=futex@entry=0x55555574dfd8 <tracy::s_profiler+728>, private=128) at ./nptl/lowlevellock.c:49
#2  0x00007ffff7898002 in lll_mutex_lock_optimized (mutex=0x55555574dfd8 <tracy::s_profiler+728>) at ./nptl/pthread_mutex_lock.c:48
#3  ___pthread_mutex_lock (mutex=0x55555574dfd8 <tracy::s_profiler+728>) at ./nptl/pthread_mutex_lock.c:93
#4  0x0000555555560b54 in __gthread_mutex_lock (__mutex=0x55555574dfd8 <tracy::s_profiler+728>) at /usr/include/x86_64-linux-gnu/c++/11/bits/gthr-default.h:749
#5  0x0000555555560be6 in std::mutex::lock (this=0x55555574dfd8 <tracy::s_profiler+728>) at /usr/include/c++/11/bits/std_mutex.h:100
#6  0x000055555556096a in tracy::Profiler::DeferItem (item=..., this=0x55555574dd00 <tracy::s_profiler>) at /usr/local/include/client/TracyProfiler.hpp:718
#7  tracy::LockableCtx::LockableCtx (srcloc=0x5555555bd360 <S::{lambda()#1}::operator()() const::srcloc>, this=0x5555555be128 <S::mutex+40>) at /usr/local/include/client/TracyLock.hpp:33
#8  tracy::Lockable<std::mutex>::Lockable (srcloc=0x5555555bd360 <S::{lambda()#1}::operator()() const::srcloc>, this=<optimized out>) at /usr/local/include/client/TracyLock.hpp:185
#9  __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at /home/martijn/zec/3rd/tracy/bug/main.cpp:7
#10 0x0000555555560a84 in _GLOBAL__sub_I_main () at /home/martijn/zec/3rd/tracy/bug/main.cpp:19
#11 0x00007ffff7829ebb in call_init (env=<optimized out>, argv=0x7fffffffe068, argc=1) at ../csu/libc-start.c:145
#12 __libc_start_main_impl (main=0x5555555606cd <main()>, argc=1, argv=0x7fffffffe068, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe058) at ../csu/libc-start.c:379
#13 0x00005555555602f5 in _start ()

So, the m_deferredLock is the one being deadlocked on. There is two places that it's locked:

After some thinking and following the code paths of the Profiler::Worker(), I am unsure of where the problem hides. The second place where the lock is taken (in Profiler::Worker()) does not even get reached as far as I can tell, because that thread is waiting on ListenSocket::Accept().

So my other option is that this is an initialization-order problem, and the m_deferredLock is actually not initialized yet. However, following that route, through GetProfilerData() I don't see a way that the mutex is not properly initialized:

https://github.com/wolfpld/tracy/blob/d1300c491ca50962b5ba4d5b7d5a56b34b9e99ad/public/client/TracyProfiler.cpp#L1216-L1233

Line 1227 in particular should initialize it, before returning it to to call DeferItem() on, here: https://github.com/wolfpld/tracy/blob/d1300c491ca50962b5ba4d5b7d5a56b34b9e99ad/public/client/TracyLock.hpp#L17-L36

I'm don't really know how to proceed debugging this. This is related to #632.

mcourteaux commented 2 months ago

Aha! The debugger indicates: tracy::s_profiler+728 in the first line of the call-stack. This means that we are looking at this instead:

https://github.com/wolfpld/tracy/blob/d1300c491ca50962b5ba4d5b7d5a56b34b9e99ad/public/client/TracyProfiler.cpp#L1349-L1352

This looks prone to initialization order issues. This seems to be taken without TRACY_DELAYED_INIT.

mcourteaux commented 2 months ago

Okay, the MWE is even simpler, and doesn't involve making the Lock static:

#define TRACY_ON_DEMAND 1

#include <tracy/Tracy.hpp>
#include <mutex>

int main() {
  TracyLockable(std::mutex, mutex);
  std::printf("This never prints.\n");

  return 0;
}

The source location static initializer now causes the problem. I sort of conclude that this initialization-order issue renders TRACY_ON_DEMAND fully broken?

mcourteaux commented 2 months ago

I'm suspecting I have not respected the compilation rules of Tracy. I precompiled and installed Tracy on my system to reduce compilation times of my project, but the TRACY_ON_DEMAND flag also influences the static library for TracyClient.a.

I'll try this and report back.

mcourteaux commented 2 months ago

Yes, this fixes it. I switched back to CMake's add_subdirectory after setting the options I want. My bad... :shrug:

mcourteaux commented 2 months ago

Reflecting a bit about this, it'd be nice if Tracy can somehow check that the feature flags match across TUs or static libraries at link-time or run time. Perhaps we can figure out something elegant, that would save people in the future the equivalent of the day that I spent on this.

Perhaps something like this in tracy/Tracy.hpp:

inline static bool validate_feature_flags = [](){
    #ifdef TRACY_ON_DEMAND
    assert(tracy::g_ON_DEMAND);
    #else
    assert(!tracy::g_ON_DEMAND);
    #endif

    // same for the other flags...

    return true;
}();

Making use of the static inline linkage. This way, every TU will check if the command-line-passed flag (-DTRACY_ON_DEMAND) matches the one in the TracyClient.cpp.o object (or static library), at program initialization.

Where, of course, in TracyClient.cpp, we'd have:

#ifdef TRACY_ON_DEMAND
bool g_ON_DEMAND = true;
#else
bool g_ON_DEMAND = false;
#endif

The above will check it at runtime, but perhaps we can even enable this check on link-time, by simply ONLY declaring the feature flags that do exist.

namespace tracy {
extern bool g_ON_DEMAND;
static void validate_feature_flags() {
    #ifdef TRACY_ON_DEMAND
    assert(tracy::g_ON_DEMAND);
    #endif

    // same for the other flags...
};
}

The extern keyword will make the file compile, but make the linker report an undefined symbol when the symbol cannot be found in the other TUs or (precompiled) static library. The static will not make the symbol visible during link-time and not violate the ODR. For the sake of completeness, TracyClient.cpp would then contain:

#ifdef TRACY_ON_DEMAND
bool g_ON_DEMAND = true;
#endif

The nice thing is that the runtime-overhead is zero in time, and a few bytes for storing those bool globals. Note that we don't even need to call validate_feature_flags() at runtime nor compile time. However, this has the problem that if TracyClient.cpp was compiled WITH a particular flag, and the user-files are compiled WITHOUT it, this will not be detected... Fixing this could be done by creating opposite flags to mark that they are not present:

#ifdef TRACY_ON_DEMAND
bool g_ON_DEMAND = true;
else
bool g_not_ON_DEMAND = true;
#endif

with

namespace tracy {
extern bool g_ON_DEMAND;
static void validate_feature_flags() {
    #ifdef TRACY_ON_DEMAND
    assert(tracy::g_ON_DEMAND);
    #else
    assert(tracy::g_not_ON_DEMAND);
    #endif

    // same for the other flags...
};
}

(Tagging you explicitly, as I already closed this, so IDK if you will get any notification if I don't. @wolfpld)

wolfpld commented 2 months ago

Compiler optimizations will remove any dead code and the linker will never see these variables.