ksahlin / strobemers

A repository for generating strobemers and evalaution
75 stars 12 forks source link

OMP parallellism generates wrong strobe<--> read associations in the index #14

Open blinard-BIOINFO opened 6 months ago

blinard-BIOINFO commented 6 months ago

@ksahlin

Fix: we had to comment this line in main.cpp to cancel parallelism:

//#pragma omp parallel for num_threads(n_threads) shared(read_cnt, output_file, q_id) private(acc,seq_rc, query_mers,query_mers_rc)

Bug observed with parallelism:

It seems that the omp pragma is incorrect and variables q_id or read_cnt have the wrong value or are read incorrectly by the threads (probably updated by another threads beforehand?).

As a consequence, the full set of strobemers that should be associated to a given query can be actually partially associated to a other queries in the final index. Because is it related to parallelism, it is hard to reproduce with an example.

Maybe variables used query identifiers should be made more atomic ?

ksahlin commented 5 months ago

I see, thanks for the report!

In more optimized implementations of the multiple query mapping problem we have removed OpenMP and are instead relying on std::thread - which makes the parallelisation more efficient, especially as the number of cores increase, using a "single producer multiple consumers" solution, e.g., as in strobealign.

Nevertheless, it is good to know this for future projects using the implementation in this repository. The easy way out for me is to, like you, deactivate parallelisation in this repo. I may add a PC solution later if I have time or find someone to implement it with threads.