biod / sambamba

Tools for working with SAM/BAM data
http://thebird.nl/blog/D_Dragon.html
GNU General Public License v2.0

Feature request: Build sambamba with libdeflate #450

Closed rhpvorderman closed 3 years ago

rhpvorderman commented 4 years ago

Libdeflate is a library that has zlib- and gzip-compatible bindings. Htslib can be built with libdeflate instead of zlib.

Most of the time spent writing BAM files is actually compressing them. Libdeflate is much faster than zlib, so I wonder if this can be an improvement for sambamba.

pjotrp commented 4 years ago

That is an interesting idea and probably easy to implement for someone with some C background. Ruben, are you interested in trying it yourself, or should we look for someone?

rhpvorderman commented 4 years ago

I have no C background whatsoever. I also don't have experience with C++ or D, so integrating libdeflate into sambamba would be quite challenging for me.

pjotrp commented 3 years ago

Just for perspective, the binding to zlib is in zlib.d. Bgzf compression is where it is used (see also decompress.d). It should not be too hard, but I don't have time for it. For Sambamba, libdeflate may help a little, but Sambamba already compresses and decompresses bgzf blocks in parallel, and that is where its speed comes from.
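To illustrate the parallel bgzf point, here is a minimal sketch in Python rather than D. It is not Sambamba's code: the names `bgzf_block`, `bgzf_compress_parallel` and `MAX_BLOCK_INPUT` are made up for this example, and it uses Python's standard zlib module, not libdeflate. It only shows that each bgzf block is an independent gzip member, so blocks can be compressed concurrently and concatenated in order. The raw-deflate call inside `bgzf_block` is the single place where a different deflate implementation would be swapped in.

```python
import gzip
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor

# Uncompressed bytes per block (assumption: same 0xFF00 value htslib uses,
# which leaves headroom so the compressed block stays under 64 KiB).
MAX_BLOCK_INPUT = 0xFF00


def bgzf_block(chunk: bytes, level: int = 6) -> bytes:
    """Compress one chunk into a single BGZF block (hypothetical helper)."""
    # Raw deflate stream (wbits=-15): no zlib header or trailer.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    cdata = compressor.compress(chunk) + compressor.flush()
    # BSIZE is the total block size minus 1: 18-byte header + data + 8-byte trailer.
    bsize = len(cdata) + 25
    # gzip header with FEXTRA set and a 'BC' subfield carrying BSIZE.
    header = struct.pack(
        "<BBBBIBBHBBHH",
        31, 139, 8, 4,   # ID1, ID2, CM=deflate, FLG=FEXTRA
        0, 0, 255,       # MTIME, XFL, OS=unknown
        6,               # XLEN
        66, 67, 2,       # SI1='B', SI2='C', SLEN
        bsize,
    )
    # Trailer: CRC32 and length of the uncompressed chunk.
    trailer = struct.pack("<II", zlib.crc32(chunk), len(chunk))
    return header + cdata + trailer


def bgzf_compress_parallel(data: bytes, workers: int = 4) -> bytes:
    """Split data into chunks and compress each chunk in a worker thread."""
    chunks = [data[i:i + MAX_BLOCK_INPUT]
              for i in range(0, len(data), MAX_BLOCK_INPUT)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so blocks stay in sequence.
        blocks = list(pool.map(bgzf_block, chunks))
    # An empty block at the end acts as the BGZF end-of-file marker.
    return b"".join(blocks) + bgzf_block(b"")


if __name__ == "__main__":
    payload = b"ACGT" * 50000
    compressed = bgzf_compress_parallel(payload)
    # Every block is a valid gzip member, so plain gzip can read the result.
    assert gzip.decompress(compressed) == payload
```

Because the blocks are independent, throughput scales with the number of workers whichever deflate library is used, so a faster per-block compressor would add to the parallel speedup, not replace it.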

I'll close the issue. If someone wants to work on it, it can be reopened.