I tested this against the current master branch and the outputs match.
hts_AdapterTrimmer is around 20% faster than the old single-threaded version in my tests (both static and dynamic builds), and much faster than hts_Overlapper, which is now the slowest tool, running at about half the speed of the fastest ones.
From my testing, this improves performance by about 25% on the one dataset I have. It still seems largely I/O bound, so longer reads or uncompressed data would likely see a larger improvement.
I changed it so the main thread reads in the data and adds jobs to a worker queue. The "number-of-threads" parameter controls the number of worker threads that pull from that queue and do the adapter trimming. Each result is stored as a future and added to a second queue, which is read by a single output thread, so the output order matches the input order. The same pattern could be applied to any of the tools where each read or pair of reads is processed independently. A sketch of the pipeline follows below.
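For reference, here is a minimal sketch of that pipeline in C++. It is not the actual HTStream code: `Read`, `trim_adapters`, and `SafeQueue` are hypothetical stand-ins. It just illustrates the idea of packaging each read as a job, queuing its future in input order, and having a single writer thread consume the futures in order.

```cpp
#include <condition_variable>
#include <deque>
#include <future>
#include <iostream>
#include <mutex>
#include <optional>
#include <string>
#include <thread>
#include <vector>

struct Read { std::string seq; };                      // stand-in for a fastq read
static std::string trim_adapters(const Read& r) {      // stand-in for the trimming work
    return r.seq.substr(0, r.seq.size() / 2);
}

// Simple thread-safe queue; pop() blocks until an item arrives or the queue is closed.
template <typename T>
class SafeQueue {
    std::deque<T> q_;
    std::mutex m_;
    std::condition_variable cv_;
    bool closed_ = false;
public:
    void push(T v) { { std::lock_guard<std::mutex> lk(m_); q_.push_back(std::move(v)); } cv_.notify_one(); }
    void close()   { { std::lock_guard<std::mutex> lk(m_); closed_ = true; } cv_.notify_all(); }
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return !q_.empty() || closed_; });
        if (q_.empty()) return std::nullopt;
        T v = std::move(q_.front()); q_.pop_front();
        return v;
    }
};

int main() {
    const unsigned num_threads = 4;                    // the "number-of-threads" parameter
    SafeQueue<std::packaged_task<std::string()>> jobs; // work for the trimming threads
    SafeQueue<std::future<std::string>> results;       // futures, queued in input order

    // Worker threads: pull a job off the queue and run it.
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < num_threads; ++i)
        workers.emplace_back([&] {
            while (auto task = jobs.pop()) (*task)();
        });

    // Single output thread: consumes futures in the order they were queued,
    // so output order matches input order regardless of which worker finishes first.
    std::thread writer([&] {
        while (auto fut = results.pop()) std::cout << fut->get() << '\n';
    });

    // Main thread: read input, package each read as a job, and queue its future.
    for (Read r : { Read{"ACGTACGTACGT"}, Read{"TTTTAAAACCCC"}, Read{"GGGGCCCCAAAA"} }) {
        std::packaged_task<std::string()> task([r] { return trim_adapters(r); });
        results.push(task.get_future());               // future queued before the task is moved away
        jobs.push(std::move(task));
    }

    jobs.close();
    for (auto& w : workers) w.join();
    results.close();
    writer.join();
}
```

The key point is that the futures are pushed by the main thread in input order, so the writer never has to re-sort results even though workers may finish out of order.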
We need a bit of refactoring to get rid of the dynamic casts, but I can do that in a new PR. Also coming up is a PR to get rid of all the warnings.