Closed jkbonfield closed 1 year ago
Extra data for other data sets (including a repeat of the NovaSeq data from above). I stuck with clang 13 -O2 and one CPU rather than testing everything, as that combination seemed realistic and showed a considerable benefit. Pleasing to see it applies well to other data too.
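Xeon Gold 6142, clang13 -O2, different data sets

| Data set | Before | After | Change |
|----------|-------:|------:|-------:|
| novaseq  |   9.95 |  8.85 | -11.1% |
| revio    |  23.33 | 19.46 | -16.6% |
| ultima   | 195.20 | 177.71 |  -9.0% |
| ONT      |  68.27 | 60.67 | -11.1% |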
Working on fixing it! Turns out my trivial 2 line SAM file for testing wasn't exactly enough. :/
None of this is huge, but it all adds up.
Improve the MD/NM generation in CRAM decoding. With decode_md=1 (default) my decode time changed from 12.91s to 12.57s. With decode_md=0 it's 11.92s, so that's about 1/3rd of the overhead removed (0.34s of the original 0.99s).
Changed the block_resize to resize in slightly smaller chunks and to use integer maths.
Reduce excessive pointer redirection in cram_decode_seq.
Unsure if this speeds things up much (sometimes it seems to), but it provides tidier code too.
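As a rough illustration of the block_resize point above, here is a minimal sketch of growing a buffer in smaller steps using only integer arithmetic. It is not the htslib code: `buf_t`, `next_capacity`, `buf_reserve` and `buf_append` are hypothetical names, and the 25% growth step and 64-byte constant are assumptions picked for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer for illustration; not htslib's cram_block. */
typedef struct {
    unsigned char *data;
    size_t alloc;   /* bytes allocated */
    size_t used;    /* bytes in use */
} buf_t;

/*
 * Pick the next capacity using integer maths only: grow by roughly 25%
 * (alloc >> 2) plus a small constant, or jump straight to `needed` if
 * that is larger. The 25% and 64 are assumed values, not from the PR.
 * Overflow for very large sizes is not handled in this sketch.
 */
static size_t next_capacity(size_t alloc, size_t needed) {
    size_t grown = alloc + (alloc >> 2) + 64;
    return grown > needed ? grown : needed;
}

/* Ensure at least `needed` bytes are allocated. Returns 0 on success, -1 on failure. */
static int buf_reserve(buf_t *b, size_t needed) {
    assert(b != NULL);
    if (b->alloc >= needed)
        return 0;

    size_t n = next_capacity(b->alloc, needed);
    unsigned char *tmp = realloc(b->data, n);
    if (!tmp)
        return -1;   /* the old buffer is still valid on failure */

    b->data = tmp;
    b->alloc = n;
    return 0;
}

/* Append `len` bytes, growing the buffer if required. */
static int buf_append(buf_t *b, const void *src, size_t len) {
    if (buf_reserve(b, b->used + len) < 0)
        return -1;
    memcpy(b->data + b->used, src, len);
    b->used += len;
    return 0;
}
```

Start from a zeroed `buf_t` (`buf_t b = {NULL, 0, 0};`), call `buf_append` as data arrives, and `free(b.data)` when done. A smaller growth factor trades a few more realloc calls for less over-allocation, and the shift avoids a round trip through floating point on every resize.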
Combined before and after on 10 million NovaSeq CRAM (v3.1)
epyc 7543
Xeon Gold 6142
The biggest change is with clang, and we also see bigger changes on Intel than on AMD.