blachlylab / dhtslib

D bindings and OOP wrappers for htslib
MIT License

Areas for performance improvement #65

Open charlesgregory opened 3 years ago

charlesgregory commented 3 years ago

If we are ever going to have a "performance mode", we need to know which areas need compiler switches for performance. We could find them either by benchmarking or by simple algorithmic analysis (eyeballing it). Somewhat related to #56.

jblachly commented 3 years ago

TBH I think we just need to run some client programs on big datasets and then do performance analysis: flamegraphs, callgraphs, whatever.

In some cases, poor performance may be out of our control. For example, the string handling in htslib for VCFs was super bad until 1.11 or 1.12, IIRC.

charlesgregory commented 3 years ago

From my attempts at making SIMD work for my own code with dhtslib, BGZF decompression is almost always the biggest bottleneck (when dealing with BAM files). So there may be very little to gain short of going through the hassle of resorting to C (and even then those gains are likely small).
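
One rough way to check a claim like this is to time the decompression step separately from whatever the client does with the bytes. The following is only a minimal sketch under stated assumptions, not a dhtslib or htslib benchmark: it uses Python's standard `gzip` module on a synthetic payload as a stand-in for BGZF blocks (BGZF is a series of gzip-compatible blocks), and `make_sample` and `time_stages` are hypothetical helper names. The "client work" here is just counting records, so the numbers say nothing about real BAM parsing.

```python
import gzip
import os
import time


def make_sample(n_records=20000):
    """Build a synthetic gzip payload (a stand-in for BGZF-compressed data)."""
    lines = [
        b"read%d\t%d\t%s\n" % (i, i * 37 % 1000, os.urandom(8).hex().encode())
        for i in range(n_records)
    ]
    raw = b"".join(lines)
    return gzip.compress(raw), raw


def time_stages(compressed):
    """Time decompression and a trivial per-file task separately."""
    t0 = time.perf_counter()
    raw = gzip.decompress(compressed)
    t1 = time.perf_counter()
    # Hypothetical client work: count newline-terminated records.
    records = raw.count(b"\n")
    t2 = time.perf_counter()
    return {"decompress": t1 - t0, "process": t2 - t1, "records": records}


if __name__ == "__main__":
    compressed, _ = make_sample()
    result = time_stages(compressed)
    total = result["decompress"] + result["process"]
    share = result["decompress"] / total if total > 0 else 0.0
    print(f"records={result['records']} decompress share={share:.1%}")
```

If the decompression share stays high for a real client on a real BAM file, that would back up the point above: compiler switches on the wrapper side can only speed up the small remainder.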