mcveanlab / mccortex

De novo genome assembly and multisample variant calling
https://github.com/mcveanlab/mccortex/wiki
MIT License

Link files should use bgzf block compression #8

Open noporpoise opened 9 years ago

noporpoise commented 9 years ago

When threading reads, a large amount of time is spent writing the link files (.ctp.gz) to disk. Compression is currently single-threaded. If instead many threads compressed blocks in parallel, a single lock could serialise the output so that only one compressed block is written to disk at a time.
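A rough sketch of that scheme, written in Python for brevity rather than in the project's own code. It assumes each block can be compressed independently as its own gzip member, and that the order of blocks in the file does not matter (blocks are written in whatever order the threads finish). The name `write_blocks_parallel` and the sample records are hypothetical and are not taken from mccortex.

```python
import gzip
import io
import threading
from concurrent.futures import ThreadPoolExecutor


def write_blocks_parallel(blocks, out, n_threads=4):
    """Compress each block on a worker thread; write under a single lock.

    `blocks` is an iterable of bytes objects and `out` is a binary file
    object. Each block becomes one standalone gzip member, so the
    concatenated output is still a valid multi-member gzip stream.
    """
    write_lock = threading.Lock()

    def worker(block):
        # Compression runs outside the lock, so threads can overlap here.
        member = gzip.compress(block)
        # Only the write is serialised: one compressed block at a time.
        with write_lock:
            out.write(member)

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() drains the iterator so worker exceptions are re-raised.
        list(pool.map(worker, blocks))


if __name__ == "__main__":
    # Hypothetical stand-in for link records; not the real .ctp format.
    blocks = [(b"record %d\n" % i) * 1000 for i in range(8)]
    buf = io.BytesIO()
    write_blocks_parallel(blocks, buf)
    # gzip.decompress reads all concatenated members back as one stream.
    print(len(gzip.decompress(buf.getvalue())))
```

Because every block is a complete gzip member, ordinary gzip readers can still decompress the whole file. If block order did matter, the writer would need to buffer finished blocks and emit them in sequence instead.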

Not sure whether we should use the bgzf block compression in htslib or roll our own block headers.
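For reference when weighing the two options, here is a minimal sketch of what a single BGZF block looks like on disk, following the layout in the SAM/BAM specification: a gzip member whose extra field carries a `BC` subfield holding the total block size minus one. It is written in Python for illustration and does not call htslib. The function name `bgzf_block` and the 0xff00 input cap are assumptions made for this sketch.

```python
import gzip
import struct
import zlib

# Assumed cap on uncompressed bytes per block, chosen so that the
# compressed block size always fits in the 16-bit BSIZE field.
MAX_BLOCK_INPUT = 0xFF00


def bgzf_block(data, level=6):
    """Return `data` wrapped as one BGZF block (a gzip member with a BC field)."""
    if len(data) > MAX_BLOCK_INPUT:
        raise ValueError("block too large for a single BGZF block")

    # wbits=-15 produces a raw deflate stream with no zlib/gzip wrapper.
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    cdata = comp.compress(data) + comp.flush()

    # 18-byte header + compressed data + 8-byte trailer, minus one.
    bsize = len(cdata) + 25
    if bsize > 0xFFFF:
        raise ValueError("compressed block does not fit in BSIZE")

    header = struct.pack(
        "<BBBBIBBHBBHH",
        31, 139,      # gzip magic bytes
        8,            # CM: deflate
        4,            # FLG: FEXTRA set
        0,            # MTIME
        0,            # XFL
        255,          # OS: unknown
        6,            # XLEN: length of the extra field
        66, 67,       # subfield id 'B', 'C'
        2,            # subfield length
        bsize,        # total block size minus one
    )
    trailer = struct.pack("<II", zlib.crc32(data), len(data))
    return header + cdata + trailer


if __name__ == "__main__":
    block = bgzf_block(b"hello links\n")
    # A BGZF block is still a valid gzip member.
    print(gzip.decompress(block))
```

The fixed-position size field is what lets a reader skip from block to block without decompressing anything. A home-grown header would need to provide something equivalent if random access or parallel decompression were wanted.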