aseyboldt / fastq-rs

MIT License

more efficient owned records, and use a fork of flate2 that supports multi-part gzip #1

Closed pmarks closed 6 years ago

pmarks commented 7 years ago

Thanks for the great lib!

FYI the required change in flate2 has been merged but not released: https://github.com/alexcrichton/flate2-rs/pull/43

It may just be worth waiting for the updated version of flate2, but I wanted to make you aware of it.

aseyboldt commented 7 years ago

Thanks for the pull request. The MultiGz change looks good, but as you said, I'd rather wait for a release of flate2. Do you know when that might happen? It seems the change has been sitting there for a month without any movement.

About the OwnedRecord: the original idea behind using individual vecs was that it makes it easy to modify the sequences (including their length), for example when trimming. But thinking about it now, I guess this could still be achieved with the faster version. Just out of interest, what are you using the OwnedRecord for?
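To make the trade-off in the comment above concrete, here is a minimal sketch of the two layouts and of trimming in each. It is not the fastq-rs implementation: `SplitRecord`, `PackedRecord`, their fields, and the `trim` methods are hypothetical names chosen for illustration, and the packed layout (header, sequence, and quality bytes concatenated in one `Vec`) is an assumption about what a "faster version" could look like.

```rust
/// Hypothetical record with one allocation per field.
#[derive(Debug, Clone, PartialEq)]
struct SplitRecord {
    head: Vec<u8>,
    seq: Vec<u8>,
    qual: Vec<u8>,
}

impl SplitRecord {
    /// Keep at most `len` bases; sequence and quality shrink together.
    fn trim(&mut self, len: usize) {
        self.seq.truncate(len);
        self.qual.truncate(len);
    }
}

/// Hypothetical record with a single allocation: head ++ seq ++ qual.
/// The quality string always has the same length as the sequence.
#[derive(Debug, Clone, PartialEq)]
struct PackedRecord {
    data: Vec<u8>,
    head_len: usize,
    seq_len: usize,
}

impl PackedRecord {
    fn new(head: &[u8], seq: &[u8], qual: &[u8]) -> Self {
        assert_eq!(seq.len(), qual.len(), "seq and qual must match in length");
        let mut data = Vec::with_capacity(head.len() + 2 * seq.len());
        data.extend_from_slice(head);
        data.extend_from_slice(seq);
        data.extend_from_slice(qual);
        PackedRecord { data, head_len: head.len(), seq_len: seq.len() }
    }

    fn head(&self) -> &[u8] {
        &self.data[..self.head_len]
    }

    fn seq(&self) -> &[u8] {
        &self.data[self.head_len..self.head_len + self.seq_len]
    }

    fn qual(&self) -> &[u8] {
        &self.data[self.head_len + self.seq_len..]
    }

    /// Keep at most `len` bases without reallocating.
    fn trim(&mut self, len: usize) {
        let len = len.min(self.seq_len);
        let qual_start = self.head_len + self.seq_len;
        // Slide the first `len` quality bytes so they follow the shortened sequence.
        self.data.copy_within(qual_start..qual_start + len, self.head_len + len);
        self.seq_len = len;
        self.data.truncate(self.head_len + 2 * len);
    }
}
```

In this sketch, trimming the packed record costs one in-place copy of the kept quality bytes, so length changes remain possible with a single allocation per record. Edits that grow a field would need more bookkeeping than in the split layout.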

pmarks commented 7 years ago

@aseyboldt - want to take a look at these updates?

aseyboldt commented 7 years ago

Sorry for not replying for so long. Thanks for the PR. The changes to Buffer.clean and MultiGzDecoder look good, but I'd rather keep the current OwnedRecord. It makes it possible to modify the record, and other than that I can't think of use cases for the owned record anyway (I might be missing something, of course).

Also, lz4 is important; a fast parser doesn't help much with gzip :-) I also prefer parasailors to rust-bio for alignments, since it is way faster. It compiles fine for me. Do you still have issues with it?