bsc-performance-tools / extrae

Instrumentation framework to generate execution traces of the most used parallel runtimes.
https://tools.bsc.es/extrae
GNU Lesser General Public License v2.1

PTHREAD test fails with Segmentation fault in Extrae 4.1.6 #104

Closed julianmorillo closed 1 week ago

julianmorillo commented 1 month ago

The test provided with the source code under tests/functional/tracer/PTHREAD fails with the following output:

Welcome to Extrae 4.1.6
Extrae: Parsing the configuration file (extrae.xml) begins
Extrae: Tracing package is located on /home/harald/aplic/extrae/3.3.0rc
Extrae: Generating intermediate files for Paraver traces.
Extrae: pthread routines will NOT collect HW counters information.
Extrae: Dynamic memory instrumentation is disabled.
Extrae: Basic I/O memory instrumentation is disabled.
Extrae: System calls instrumentation is disabled.
Extrae: Parsing the configuration file (extrae.xml) has ended
Extrae: Intermediate traces will be stored in /tmp/eb/easybuild/build/Extrae/4.1.6/gompi-2023b/extrae-4.1.6/tests/functional/tracer/PTHREAD
Extrae: Tracing mode is set to: Detail.
Extrae: Successfully initiated with 1 tasks and 1 threads

Segmentation fault (core dumped)

Running just the binary ./pthread works fine, and preloading any other library that does not trace pthreads (for example, libseqtrace.so) also prevents the segmentation fault from occurring. I'm running on arriesgado-4 (a RISC-V machine).

gllort commented 3 weeks ago

Is this the same issue that was reported by e-mail? If so, for reference, please try commenting out the following call in src/tracer/wrappers/pthread/pthread_wrapper.c, at line 240:

//Backend_Flush_pThread (pthread_self());

Does this fix the issue?

julianmorillo commented 3 weeks ago

Yes, it is the same issue that I reported by e-mail.

Yes, this fixes the issue. The PTHREAD test now passes. Is this a proper fix or just a workaround?

gllort commented 2 weeks ago

It is more or less a proper fix. The call to Backend_Flush_pThread happens at the end of the "start_routine" executed by pthread_create, just before the thread dies, and makes the thread write its own trace data to disk. The problem is that a thread might not finish before the main thread; in that case it never exits the "start_routine" and its flush of the trace never happens, so we need the master thread to do a final flush at the end of the process to recover all the threads' data. Having two points in the execution where a given pthread's data might be dumped concurrently causes some known race conditions. Since we can't drop the final flush, which is needed for the threads that don't finish, we'll probably eliminate the first flush point.
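
For illustration only, here is a minimal C sketch of the two flush points described above. It is not Extrae's code: trace_buffer_t, record_event, flush_buffer, start_routine_wrapper and run_demo are hypothetical names, and the "trace" is just a pair of counters. The flush_at_exit flag stands in for the call that was commented out, and the flush after pthread_join stands in for the master thread's final flush. The per-buffer mutex is an assumption added here to show one way the two flush points could coexist safely; the comment above suggests removing the first flush point instead.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define EVENTS_PER_THREAD 3

/* Hypothetical per-thread trace buffer (not Extrae's real structure). */
typedef struct {
    pthread_mutex_t lock; /* serializes recording and both flush points */
    int pending;          /* events buffered in memory */
    int written;          /* events already "written to disk" */
} trace_buffer_t;

typedef struct {
    trace_buffer_t *buf;
    int flush_at_exit;    /* 1 = keep the per-thread flush point */
} wrapper_args_t;

static void record_event(trace_buffer_t *b)
{
    assert(b != NULL);
    pthread_mutex_lock(&b->lock);
    b->pending++;
    pthread_mutex_unlock(&b->lock);
}

/* Idempotent flush: a second call finds nothing pending and writes nothing. */
static void flush_buffer(trace_buffer_t *b)
{
    assert(b != NULL);
    pthread_mutex_lock(&b->lock);
    b->written += b->pending;
    b->pending = 0;
    pthread_mutex_unlock(&b->lock);
}

/* Stand-in for the wrapped start routine of a traced thread. */
static void *start_routine_wrapper(void *p)
{
    wrapper_args_t *w = p;
    for (int i = 0; i < EVENTS_PER_THREAD; i++)
        record_event(w->buf);
    if (w->flush_at_exit)
        flush_buffer(w->buf); /* first flush point: thread about to die */
    return NULL;
}

/* Returns the total number of events written, or -1 if a thread
 * could not be created. */
int run_demo(int flush_at_exit)
{
    trace_buffer_t bufs[NUM_THREADS];
    wrapper_args_t args[NUM_THREADS];
    pthread_t tids[NUM_THREADS];
    int created = 0, total = 0, failed = 0;

    for (int i = 0; i < NUM_THREADS; i++) {
        pthread_mutex_init(&bufs[i].lock, NULL);
        bufs[i].pending = 0;
        bufs[i].written = 0;
        args[i].buf = &bufs[i];
        args[i].flush_at_exit = flush_at_exit;
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&tids[i], NULL, start_routine_wrapper, &args[i]) != 0) {
            failed = 1;
            break;
        }
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    /* Second flush point: the final flush done by the main thread. */
    for (int i = 0; i < NUM_THREADS; i++) {
        flush_buffer(&bufs[i]);
        total += bufs[i].written;
        pthread_mutex_destroy(&bufs[i].lock);
    }
    return failed ? -1 : total;
}
```

Calling run_demo(0) models the workaround (only the final flush runs) and run_demo(1) models both flush points being active. In this sketch both calls account for every recorded event exactly once, because the flush is idempotent and guarded by the lock. The demo joins all threads before the final flush to keep it deterministic, so it does not reproduce the case of threads that never finish.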