luizirber / niffler

Simple and transparent support for compressed files.
Apache License 2.0

Replace GzDecoder with MultiGzDecoder? #27

Closed: luizirber closed this issue 4 years ago

luizirber commented 4 years ago

Should we replace the GzDecoder from flate2 with MultiGzDecoder (which would also support BGZF)?

Are there other consequences of using MultiGzDecoder with regular gzip files? (Apparently not; it didn't break any of our tests.)

from https://github.com/onecodex/needletail/pull/45#issuecomment-651235214

Also connected to #24, since niffler users might want to use specific backends instead of our default choices.

hcdenbakker commented 4 years ago

Wow, I was just going to suggest this (I am trying out niffler for some bioinformatics applications), and ran into this issue while testing it. Please do!

luizirber commented 4 years ago

> Wow, I was just going to suggest this (I am trying out niffler for some bioinformatics applications), and ran into this issue while testing it. Please do!

Hah, cool! Do you want to test the multigz branch from #28 and see if it works for you? Or, given that you have more experience with BGZF, do you see any drawbacks of using MultiGzDecoder by default?

hcdenbakker commented 4 years ago

I do not see, and have not experienced, any drawbacks of using MultiGzDecoder by default. I haven't done any serious benchmarking, but if you work with Illumina data fresh from the machine, MultiGzDecoder is your only option. I will test the multigz branch and let you know what I find.

hcdenbakker commented 4 years ago

The multigz branch works great for me.

luizirber commented 4 years ago

2.1.1 released with this change, thanks for trying it @hcdenbakker!