muellan / metacache

memory-efficient, fast & precise taxonomic classification system for metagenomic read mapping
GNU General Public License v3.0
57 stars 12 forks

Support for gzipped reads #12

Closed: donovan-h-parks closed 3 years ago

donovan-h-parks commented 4 years ago

Hi.

Are there plans to support reads in gzipped format (e.g. my_reads.fq.gz)? This would be a major help for incorporating MetaCache into workflows.

Cheers, Donovan

muellan commented 4 years ago

Yes, it's on my list for the next major version, but we are currently ironing out some bugs and also working on some improvements. So, it may take some time.

donovan-h-parks commented 4 years ago

Hi. Thank you for the quick response. In my testing, MetaCache is certainly among the best performing classifiers available. Are any of the upcoming bug fixes critical?

muellan commented 4 years ago

There's currently a bug that was introduced in the last version. It leads to unnecessarily high memory consumption during database builds. The fix is already implemented and will be released shortly. There are some other minor things, nothing that would affect the classification results.

donovan-h-parks commented 4 years ago

Thanks. I'm currently using v0.9.0, so perhaps I have avoided these issues.

jdwinkler-lanzatech commented 3 years ago

Yes, gzipped FASTQ compatibility would be very useful for me as well.

muellan commented 3 years ago

Just to let you all know that the next version of MetaCache does support reading gzipped sequence files. Since it also contains a large portion of new code for accelerating builds and querying, it might take a few weeks until we release it.

jdwinkler-lanzatech commented 3 years ago

Nice, thanks!

jfy133 commented 3 years ago

Is there any ETA on when the new release with gzip support will be out?

It would be important for incorporation into my workflows as well.

muellan commented 3 years ago

We currently have a paper under review. We will make the code of the latest version, which also supports reading gzipped files (and has many more capabilities), available as soon as the paper is accepted (fingers crossed). Unfortunately, we don't have the time to back-port the reading of compressed files to an older version at the moment. So it will likely take a few weeks until we can make the newest version public.

jfy133 commented 3 years ago

No problem, good to know the paper is under review! Good luck, and looking forward to it!

muellan commented 3 years ago

Reading gzipped files is now supported in the latest release!

jfy133 commented 3 years ago

Woohoo thank you!