plasma-umass / coz

Coz: Causal Profiling

Infinite recursion when libpthread.so.0 cannot be dynamically loaded #229

Closed mbUSC closed 1 month ago

mbUSC commented 4 months ago

In get_pthread_handle (real.cpp), dlopen can fail to load "libpthread.so.0" and return a NULL pointer. This condition is not properly handled by Coz.
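As a rough illustration of what handling this condition could look like, here is a minimal sketch of a checked dlopen wrapper. The helper name `open_library_checked` is hypothetical and is not taken from Coz's source; it only shows the null check and the `dlerror` report that would make the failure visible instead of silent.

```cpp
#include <dlfcn.h>
#include <cassert>
#include <cstdio>

// Hypothetical helper (not Coz's actual code): open a shared library and
// report why loading failed instead of silently handing back a null handle.
void* open_library_checked(const char* name) {
  assert(name != nullptr);  // precondition: a library name must be given

  dlerror();  // clear any stale error message from earlier dl* calls
  void* handle = dlopen(name, RTLD_NOW | RTLD_GLOBAL);
  if (handle == nullptr) {
    const char* reason = dlerror();
    std::fprintf(stderr, "dlopen(%s) failed: %s\n", name,
                 reason != nullptr ? reason : "unknown error");
  }
  return handle;  // may still be null; the caller must check before dlsym
}
```

The caller still has to decide what to do with a null result; the point of the sketch is only that the null handle should never reach dlsym unchecked.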

How to reproduce the problem:

Build ClickHouse (https://github.com/ClickHouse/ClickHouse), then try profiling the clickhouse_server program with Coz. Coz fails silently and produces no output.

Debugging:

Running Coz under strace shows that the child process terminates with SIGSEGV. A core dump shows a stack overflow caused by infinite recursion in pthread_cond_broadcast. The cause is that RTLD_DEFAULT is defined as NULL in include/dlfcn.h, so when dlopen fails, dlsym interprets the NULL handle passed to it as RTLD_DEFAULT. The lookup for the "real" function then uses the default search order and returns the overridden symbol from libcoz.so instead of the actual pthreads implementation, so the wrapper ends up calling itself.
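One possible way to avoid the null-handle lookup is sketched below. It assumes a Linux dynamic linker that provides `RTLD_NEXT`; the helper name `resolve_real_symbol` is hypothetical, and this is not necessarily how Coz fixed the issue. When no library handle is available, the sketch falls back to `RTLD_NEXT`, which searches only the objects that come after the calling object in the search order, so an interposing library would not find its own wrapper.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // RTLD_NEXT is a GNU extension in <dlfcn.h>
#endif
#include <dlfcn.h>
#include <cassert>

// Hypothetical helper (not Coz's actual code): resolve the "real" function
// without ever passing a null handle to dlsym, because a null handle is
// treated as RTLD_DEFAULT and may return the interposed wrapper itself.
void* resolve_real_symbol(void* library_handle, const char* symbol) {
  assert(symbol != nullptr);  // precondition: a symbol name must be given

  if (library_handle != nullptr) {
    // The library was loaded, so look the symbol up in it directly.
    if (void* fn = dlsym(library_handle, symbol)) {
      return fn;
    }
  }
  // No usable handle: search the objects loaded after the caller instead of
  // the default search order.
  return dlsym(RTLD_NEXT, symbol);
}
```

With this shape, a failed dlopen degrades to an `RTLD_NEXT` lookup rather than to the default search order. The result can still be null if the symbol is not found anywhere, so the caller should check it before calling through the pointer.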