dcjones / quip

Compressing next-generation sequencing data with extreme prejudice.
http://www.cs.washington.edu/homes/dcjones/quip/
BSD 3-Clause "New" or "Revised" License

File compression with -a fails for a particular fastq file #12

Closed vedarethinar closed 11 years ago

vedarethinar commented 11 years ago

During our test of quip compression on 16 fastq files, one file failed: the quip application hangs on it. Is there anything we need to take note of when using -a during compression?

dcjones commented 11 years ago

It definitely should not hang during compression for any reason. Can you post a few entries from the file you are compressing? That may contain some clues.

DarwinAwardWinner commented 11 years ago

I've run into this problem a few times. Quip successfully compresses all the fastq files in a dataset except one, which hangs for over a day. I usually work around it by re-running quip without the -a option on just that one file.

dcjones commented 11 years ago

Ok, I'm eager to debug it, but I haven't reproduced it yet. Any help (e.g. pointing me to a dataset where it fails) would be appreciated.

DarwinAwardWinner commented 11 years ago

Ok, I think I can find the problematic file and see if it consistently hangs. If so, do you have somewhere I can upload it? I'll see if I can find a public upload spot on my end.

DarwinAwardWinner commented 11 years ago

I've put a minimal example that triggers the bug on my Dropbox account. It is compressed using quip without assembly. It consists of the first 2500000 reads in the problematic file, since that is the default number used in assembly. The reduced file seems to also hang during assembly.

http://dl.dropbox.com/u/1581949/brokenquipsmall-noassembly.fastq.qp
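For anyone who wants to build a similar reduced test case from their own data, here is a minimal sketch of cutting a FASTQ file down to its first N reads. It assumes an uncompressed FASTQ file in which every record is exactly four lines (header, sequence, "+", qualities); the function name `head_fastq` and the file paths are hypothetical, and this is not part of quip itself.

```python
import itertools


def head_fastq(src_path, dst_path, n_reads=2_500_000):
    """Copy the first n_reads records of a FASTQ file to dst_path.

    Assumption: each record is exactly four lines, so the first
    n_reads records are the first 4 * n_reads lines.
    Returns the number of complete records written.
    """
    lines_written = 0
    # Binary mode keeps the bytes (including line endings) unchanged.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in itertools.islice(src, 4 * n_reads):
            dst.write(line)
            lines_written += 1
    return lines_written // 4


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    count = head_fastq("problem.fastq", "problem-small.fastq")
    print(f"wrote {count} reads")
```

If the input has fewer than N reads, the whole file is copied and the smaller count is returned. The default of 2500000 matches the number of reads mentioned above.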

dcjones commented 11 years ago

Awesome, thanks! I'll get this sorted out tomorrow.

dcjones commented 11 years ago

Thanks again for the help. I found the bug. It's trivial to fix, but I need to do so carefully to preserve backwards compatibility. Look for a new version in the next few days.

dcjones commented 11 years ago

Fixed now in version 1.1.4. (Source tarball)