DaehwanKimLab / centrifuge

Classifier for metagenomic sequences
GNU General Public License v3.0

.gz support for kreport script #21

Open ffinfo opened 8 years ago

ffinfo commented 8 years ago

It would be nice if the kreport script supported gzipped input files. The output currently has one line per read/pair, so it can become a very large file. Piping from centrifuge to gzip is easy, but kreport cannot read the compressed result.

fbreitwieser commented 7 years ago

@ffinfo, you can pipe into centrifuge-kreport. That means one of the two following ways should work, depending on your shell:

gunzip -c centrifuge-out.gz | centrifuge-kreport -x IDX
# or, using process substitution (with bash)
centrifuge-kreport -x IDX <(gunzip -c centrifuge-out.gz)

Can you try that?

ffinfo commented 7 years ago

Oh good, I'm going to try this. Will let you know if it works. Maybe it would be good to add this to the usage text of the script/tool?

ffinfo commented 7 years ago

The first one does not work because centrifuge-kreport does not read from stdin. The second one is basically an inline FIFO pipe, and it does work.

For usability, it might be good to let the tool also accept input from stdin and directly from a .gz file.

fbreitwieser commented 7 years ago

Fixed the input from STDIN. Will add the feature to read directly from a gzipped file in the future.
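
Until direct .gz support is available, a small shell helper can hide the difference between compressed and uncompressed output. This is only a minimal sketch: the function name `cat_maybe_gz` is hypothetical and not part of centrifuge, and it assumes that a file name ending in `.gz` really is gzip-compressed.

```shell
# Hypothetical helper, not part of centrifuge.
# Writes the given file to stdout, decompressing it first when the
# name ends in .gz. "-" or no argument means: pass stdin through.
cat_maybe_gz() {
    case "${1:--}" in
        -)    cat ;;                  # read from stdin unchanged
        *.gz) gzip -dc -- "$1" ;;     # decompress to stdout
        *)    cat -- "$1" ;;          # plain file
    esac
}
```

With the STDIN fix in place, something like `cat_maybe_gz centrifuge-out.gz | centrifuge-kreport -x IDX` should then behave the same for compressed and uncompressed files.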