Open esamorodnitsky opened 3 years ago
Hmm, that's strange; I've not heard of that before. You're right that when multi-threaded, svaba will process different genomic windows in non-deterministic order, because the threads just grab them from a queue. It shouldn't get into a loop though, since once a window is taken out of the queue it shouldn't be available to other threads. Nevertheless I believe you, even more so if this doesn't happen on a single thread. Any clues as to how to try to recreate it?
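To illustrate the queue behaviour described above, here is a minimal Python sketch. It is not svaba's actual code (svaba is written in C++); the function name `process_windows`, the window coordinates, and the thread count are all made up for illustration. It only shows the general pattern: each window is removed from a shared queue by exactly one thread, so the completion order can differ between runs, but no window should be handed out twice.

```python
import queue
import threading


def process_windows(windows, num_threads=4):
    """Hand each window to exactly one worker thread via a shared queue.

    Hypothetical stand-in for a multi-threaded window scheduler.
    """
    work = queue.Queue()
    for w in windows:
        work.put(w)

    done = []                # completion order; may vary from run to run
    lock = threading.Lock()  # protects the shared result list

    def worker():
        while True:
            try:
                # get_nowait() removes the window, so no other thread sees it
                window = work.get_nowait()
            except queue.Empty:
                return       # queue drained, this thread exits
            with lock:
                done.append(window)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done


# Made-up example windows: (chromosome, start, end)
windows = [("chr1", start, start + 25000) for start in range(0, 250000, 25000)]
order = process_windows(windows)
```

Under this pattern, a hang would more plausibly come from the work done inside a window, or from how workers wait on each other, than from a window being processed repeatedly.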
I am not 100% sure how you could reproduce it, but maybe I could send you a dataset that my lab uses? The infinite loops may be caused by the specific kinds of DNA enrichment that my lab uses, which you may not have expected to be fed into SvABA when you designed it.
If it's not protected data (usually it is), then I could try to take a look; I'm finally getting some open time to work on this stuff. However, I doubt (though I have been surprised before) that it has to do with the molecules themselves. More likely it is some quirk in how the thread queues are processed.
I am talking with my lab to see if it's okay for me to send you this data (as long as I remove patient identifiers). If it is not the molecules, it could also have to do with the HPC system that I use and how it's set up. Will get back to you shortly!
Sometimes, SvABA runs into infinite loops. When I rerun the job exactly as is, it goes through to completion just fine. Without actually knowing the code, I think it might have to do with multi-threading and randomness.