rrwick / Porechop

adapter trimmer for Oxford Nanopore reads
GNU General Public License v3.0

optional compression? #28

Closed: jvolkening closed this issue 6 years ago

jvolkening commented 6 years ago

It seems that all demultiplexed output is gzip-compressed by default. Is there a way to turn off compression on these output files?

wdecoster commented 6 years ago

https://github.com/rrwick/Porechop#output

jvolkening commented 6 years ago

> https://github.com/rrwick/Porechop#output

Yes, but this does not seem to work as described (at least in the latest stable release). Demultiplexed output files are always gzip-compressed, even when providing an uncompressed FASTA input and/or specifying --format fasta on the command line.

rrwick commented 6 years ago

Hi Jeremy,

I've been playing around with this one and I can't reproduce it. Here's an example input file: test_barcodes.fasta.txt (I had to add the .txt to the end to make GitHub accept it as an attachment). When I run this command: porechop -i test_barcodes.fasta -b test_barcodes_output I get uncompressed fasta files in the output directory. Can you try it with the same file and see what you get?

And off topic, but are you the Jeremy Volkening that went to Verona Area High School and played trumpet in the wind ensemble? If so, I'm the Ryan Wick of the same school/band/instrument, a year or two below you! How's it going?! If you're not that Jeremy... well, never mind 😄

Ryan

jvolkening commented 6 years ago

> And off topic, but are you the Jeremy Volkening that went to Verona Area High School and played trumpet in the wind ensemble?

That's me. I remember you now, although I hadn't made the connection with the name at first. What are the chances of running into another Verona grad (from almost 20 yrs ago!) when working on MinION tools? I'm sure I'll be back here soon, as a group I'm working with is jumping headfirst into the nanopore platforms and they want porechop integrated into our cluster and Galaxy workflows.

> When I run this command: porechop -i test_barcodes.fasta -b test_barcodes_output I get uncompressed fasta files in the output directory. Can you try it with the same file and see what you get?

I just tested your command and input with both the v0.2.1 tarball and the latest pull from GitHub. The release version produces gzip-compressed files, while the latest commit produces uncompressed output as expected, so it seems this has already been changed/fixed.
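For anyone who wants to check their own output files the same way, here is a minimal sketch in Python. It is not part of Porechop; the function name `is_gzipped` is made up for illustration. It relies only on the fact that gzip streams begin with the two magic bytes 0x1f 0x8b, so it inspects the file contents rather than trusting the file extension.

```python
# Hypothetical helper, not part of Porechop: detect gzip by magic bytes.
GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC
```

Calling `is_gzipped()` on each file in the demultiplexed output directory (test_barcodes_output in the command above) should show whether a given Porechop version compressed its output.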

This all comes about as part of an effort to get porechop into bioconda for use with Galaxy, and bioconda strongly prefers pulling from stable releases rather than the latest commit. I'll just wait for the next release, although given other issues (#29) there's no rush.