EddyRivasLab / hmmer

HMMER: biological sequence analysis using profile HMMs
http://hmmer.org
Other
305 stars 69 forks source link

A data race in jackhmmer v3.3 #278

Closed Augustin-Zidek closed 1 year ago

Augustin-Zidek commented 2 years ago

I ran Thread Sanitizer on our Jackhmmer test with --cpu=8 (the issue doesn't show up with --cpu=1) and it found this issue:

==================
WARNING: ThreadSanitizer: data race (pid=8858)
  Write of size 4 at 0x7b200003a100 by main thread:
    #0 esl_threads_WaitForFinish hmmer/easel/esl_threads.c:209:23 
    #1 thread_loop hmmer/src/jackhmmer.c:1701:7 
    #2 serial_master hmmer/src/jackhmmer.c:651:29 
    #3 main hmmer/src/jackhmmer.c:415:16 

  Previous read of size 4 at 0x7b200003a100 by thread T19:
    #0 esl_threads_Started hmmer/easel/esl_threads.c 
    #1 pipeline_thread hmmer/src/jackhmmer.c:1724:3 

  Location is heap block of size 128 at 0x7b200003a100 allocated by main thread:
    #0 calloc llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:687:5 
    #1 esl_threads_Create hmmer/easel/esl_threads.c:58:3 
    #2 serial_master hmmer/src/jackhmmer.c:538:19 
    #3 main hmmer/src/jackhmmer.c:415:16 

  Thread T19 
    #0 pthread_create llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1022:3 
    #1 esl_threads_AddThread hmmer/easel/esl_threads.c:128:7 
    #2 serial_master hmmer/src/jackhmmer.c:646:23 
    #3 main hmmer/src/jackhmmer.c:415:16 

SUMMARY: ThreadSanitizer: data race hmmer/easel/esl_threads.c:209:23 in esl_threads_WaitForFinish

This is with hmmer v3.3 downloaded on 2020-04-07, the problematic line in Jackhmmer is the following line in esl_threads.c:

if (pthread_join(obj->threadId[w], NULL) != 0) ESL_EXCEPTION(eslESYS, "pthread join failed");
obj->threadCount--;  // THIS LINE

This issue can be fixed by making threadCount atomic in the ESL_THREADS struct in esl_threads.h, i.e. changing:

int      threadCount;      /* number of active worker threads                           */

to

#include <stdatomic.h>
// ...
atomic_int      threadCount;      /* number of active worker threads                           */

I am however unsure whether this is a proper fix, or whether it indicates there is a more serious underlying issue. Could you advise whether this fix is correct?

cryptogenomicon commented 1 year ago

See comment on #277; both #278 and #277 are addressed by the same set of three commits.