Closed rhalperin closed 2 years ago
Thanks for reporting this.
The "D" state is uninterruptible sleep, often in kernel space and typically IO. We will look into this.
Looks like the bottleneck has to do with my use of D's std.algorithm : chunkBy and dhtslib's SAMReader.all_records range. I have experienced issues with this before, though I didn't realize it affected fade. Should have a fix out soon, just need to replace D's chunkBy algorithm.
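For readers unfamiliar with this kind of grouping, here is a minimal sketch in Python, not D. It is not fade's code and not the fix that was shipped. D's chunkBy groups consecutive elements that share a key, and Python's itertools.groupby does the same in a single pass. The record tuples and the group_by_name helper are hypothetical, and the assumption that records are grouped by read name is a guess based on the name-sorted input (samtools sort -n).

```python
from itertools import groupby
from operator import itemgetter


def group_by_name(records):
    """Yield (name, [records]) for each run of adjacent records sharing a name.

    Only adjacent records are merged, so the input must already be
    name-sorted for every record of a given name to land in one group.
    """
    for name, run in groupby(records, key=itemgetter(0)):
        # Materialize the run before advancing; groupby invalidates the
        # previous group's iterator once the next key is requested.
        yield name, list(run)


# Hypothetical (read name, alignment) pairs standing in for BAM records.
records = [
    ("read1", "aln_a"),
    ("read1", "aln_b"),
    ("read2", "aln_c"),
    ("read3", "aln_d"),
    ("read3", "aln_e"),
]

groups = list(group_by_name(records))
for name, run in groups:
    print(name, len(run))
```

Each record is visited once, so the grouping itself is linear in the number of records. A slowdown like the one reported here would have to come from how the underlying range is consumed, not from the grouping idea.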
Interesting/surprising that it would manifest as uninterruptible sleep, which again I believe is usually IO related.
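For anyone who wants to confirm the state outside of top, here is a rough Linux-only sketch that reads the one-letter state code from /proc/<pid>/stat. The helper names parse_proc_state and read_state are made up for illustration, and the pid would be whatever process you are watching.

```python
# One-letter process state codes as documented in proc(5).
STATE_NAMES = {
    "R": "running",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep (usually IO)",
    "Z": "zombie",
    "T": "stopped",
}


def parse_proc_state(stat_line: str) -> str:
    """Return the state code from the contents of /proc/<pid>/stat."""
    # Field 2 (the command name) is wrapped in parentheses and may itself
    # contain spaces or ')', so split on the last ')' rather than on spaces.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def read_state(pid: int) -> str:
    """Read the current state code of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as fh:
        return parse_proc_state(fh.read())


# Hypothetical stat line for a process stuck in uninterruptible sleep.
sample = "1234 (fade out) D 1 1234 1234 0 -1 4194304"
state = parse_proc_state(sample)
print(state, STATE_NAMES.get(state, "other"))
```

Sampling read_state in a loop and counting how often it returns "D" gives a rough idea of how much time the process spends blocked.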
@rhalperin Can you try v0.5.7? This should be fixed now.
That worked, it now runs in 30sec on the 0.5G bam, thanks!
Using a trivially sized 0.5G bam for testing purposes, I found that it took ~10hrs to run fade out on the output of samtools sort -n. Watching the process in top, it appears to have low CPU usage and spends a lot of time in the 'D' state. In comparison, running fade out on the same bam without sorting took about 20sec. I am seeing the same behavior running fade on a cloud workstation with Ubuntu 18.04 and fade installed via conda, as well as running in the blachlylab/fad docker image on my mac.