dcjones / quip

Compressing next-generation sequencing data with extreme prejudice.
http://www.cs.washington.edu/homes/dcjones/quip/
BSD 3-Clause "New" or "Revised" License

Inefficiency when compressing reads of variable length. #11

Open dcjones opened 11 years ago

dcjones commented 11 years ago

Quip compresses sub-optimally when the reads in a file are not mostly the same length.

Background: Currently, whenever a read of a different length is encountered, the new length is written uncompressed. In files where the read length changes with every read, this unnecessarily inflates the file size.
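
To get a rough feel for the overhead, here is a minimal sketch in Python that compares the cost of writing each changed length raw against a zeroth-order entropy bound for coding the lengths. It is not Quip's code. The 4-byte field width (`RAW_LENGTH_BYTES`), the reading of "differing length" as "different from the previous read", and both function names are assumptions made for illustration.

```python
import math
from collections import Counter

# Assumed width of an uncompressed length field; not taken from Quip's format.
RAW_LENGTH_BYTES = 4


def raw_length_cost(read_lengths):
    """Bytes spent if a length is written raw each time it differs from the previous read's."""
    cost = 0
    previous = None
    for length in read_lengths:
        if length != previous:
            cost += RAW_LENGTH_BYTES
            previous = length
    return cost


def entropy_bound_bytes(read_lengths):
    """Zeroth-order entropy lower bound, in bytes, for coding every read length."""
    counts = Counter(read_lengths)
    total = len(read_lengths)
    bits = -sum(c * math.log2(c / total) for c in counts.values())
    return bits / 8
```

For a file whose lengths alternate as `[100, 101, 100, 101]`, `raw_length_cost` charges 16 bytes under these assumptions, while `entropy_bound_bytes` gives 0.5 bytes. The gap grows with the number of reads, which is the inflation described above.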