appliedbinf / URDO-SMOREd

Sequence Matching fOr REspiratory Diseases, SMORE'D, is a command-line sequence classification tool tailored to meet the needs of the Undiagnosed Respiratory Disease Outbreak (URDO) branch at CDC. SMORE'D is a k-mer-based classification tool capable of rapidly classifying read sequences generated by multi-pathogen detection platforms.

SMORE'D crashing with too many threads #17

Closed annagaines closed 4 years ago

ar0ch commented 4 years ago

I've tracked this down to VSEARCH dying (reproducibly) in async threads. There aren't any logs explaining why it's dying, so I'm going to split off a new branch with some additional debug output added.

ar0ch commented 4 years ago

Alright, it looks like this is caused by hitting the system process limit: vsearch tries to use 20 threads by default, which creates n + (n*20) threads, where n is the number of threads set for SMORE'D. The stopgap measure of setting --threads 1 should prevent this from happening on most systems for now. So this is partially fixed in aadf5a2fa8915e402d47ab76dc911ded0ccbd49a; I still need to explore our options for raising this exception.
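To make the arithmetic above concrete, here is a minimal sketch of the thread count and of the stopgap. The helper names `total_threads` and `with_thread_cap` are hypothetical and are not taken from the SMORE'D code; the sketch also assumes that `--threads` is passed to VSEARCH itself.

```python
from typing import List


def total_threads(smored_threads: int, vsearch_threads: int = 20) -> int:
    """Estimate n + (n * v): n SMORE'D workers, each starting a VSEARCH
    process that uses v threads. The default of 20 is the figure reported
    in this thread, not a documented constant."""
    return smored_threads + smored_threads * vsearch_threads


def with_thread_cap(vsearch_cmd: List[str], threads: int = 1) -> List[str]:
    """Return a copy of a VSEARCH command with an explicit --threads value.
    (Hypothetical helper; the real command is built elsewhere.)"""
    return [*vsearch_cmd, "--threads", str(threads)]


print(total_threads(8))      # 8 + 8*20 = 168
print(total_threads(8, 1))   # 8 + 8*1  = 16
```

With the cap in place, the total grows as 2n rather than 21n, which is why the stopgap is expected to stay under the process limit on most systems.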

ar0ch commented 4 years ago

I'm inclined to mark this as closed/wont-fix. I haven't found a reliable way to trap this condition (different systems raise different errors). The optional enhancement is to retry processing any samples that don't appear in the results dict after read_processor completes.

Thoughts @lavanyarishishwar ?
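For discussion, the optional retry could look roughly like the sketch below. It assumes a `read_processor` callable that takes a list of sample names and returns a dict containing only the samples that succeeded; that signature, `process_with_retry`, and `max_retries` are all assumptions for illustration, not the current SMORE'D API.

```python
from typing import Callable, Dict, List, Tuple


def process_with_retry(
    samples: List[str],
    read_processor: Callable[[List[str]], Dict[str, str]],
    max_retries: int = 2,
) -> Tuple[Dict[str, str], List[str]]:
    """Run read_processor, then re-run it on any samples that are
    missing from the results dict, up to max_retries extra attempts."""
    results: Dict[str, str] = {}
    pending = list(samples)
    for _ in range(max_retries + 1):
        if not pending:
            break
        results.update(read_processor(pending))
        # Anything absent from the results is assumed to have failed silently.
        pending = [s for s in pending if s not in results]
    return results, pending


def make_flaky_processor() -> Callable[[List[str]], Dict[str, str]]:
    """Stand-in processor that silently drops 'sample_b' on its first attempt."""
    dropped_once = set()

    def processor(batch: List[str]) -> Dict[str, str]:
        out = {}
        for sample in batch:
            if sample == "sample_b" and sample not in dropped_once:
                dropped_once.add(sample)
                continue
            out[sample] = "classified:" + sample
        return out

    return processor


results, missing = process_with_retry(["sample_a", "sample_b"], make_flaky_processor())
print(sorted(results), missing)
```

Samples still listed in `missing` after the last attempt would need to be reported to the user rather than dropped.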

ar0ch commented 4 years ago

SMORE'D now raises a generic exception when VSEARCH dies prematurely or doesn't process an output; see new lines 110-129.
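For readers without the diff at hand, a check of this kind can be sketched as follows. This is not the code at lines 110-129; `run_checked` is a hypothetical helper, and the sketch assumes a missing output file is the signal that nothing was processed.

```python
import os
import subprocess
from typing import List


def run_checked(cmd: List[str], output_path: str) -> str:
    """Run an external command and raise a generic exception if it exits
    abnormally or leaves no output file behind."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    # On POSIX, a negative return code means the child was killed by a signal.
    if proc.returncode != 0:
        raise RuntimeError(
            "{} exited with code {}: {}".format(
                cmd[0], proc.returncode, proc.stderr.strip()
            )
        )
    # Only existence is checked: an empty file may be a valid "no hits" result.
    if not os.path.exists(output_path):
        raise RuntimeError(
            "{} finished but produced no output at {}".format(cmd[0], output_path)
        )
    return output_path
```

A caller would pass the full VSEARCH command line as `cmd` and the path it was told to write as `output_path`, catching `RuntimeError` per sample so that one failure does not abort the whole run.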

ar0ch commented 4 years ago

New tests still need to be written to cover some of this new code, however.