jdidion / atropos

An NGS read trimming tool that is specific, sensitive, and speedy. (production)

gzip error code -9, different than issue #62? #78

Open pkMyt1 opened 5 years ago

pkMyt1 commented 5 years ago

This appears to have started with 1.1.19, and I am still getting it with 1.1.21. Originally I thought it was an issue with the cluster I use on my end, but it doesn't look like it. Here is part of the log with the error. There are about 378 million lines in these FASTQ files. The program fails somewhere around 32 to 40 million; the exact point varies with each test. gzip -t reports no issues with the files. I have trimmed this file before with 1.1.15. The config file is attached: 18001_CTGTAGCC_Atropos_Config.txt

2018-12-17 16:01:42,570 ERROR: Worker process 1 waiting on batch for 60.0 seconds
2018-12-17 16:01:42,614 ERROR: Result process waiting on result for 60.1 seconds
2018-12-17 16:06:46,813 ERROR: Result process waiting on result for 67.6 seconds
2018-12-17 16:06:59,233 ERROR: Result process waiting on result for 80.6 seconds
2018-12-17 16:07:04,378 ERROR: Result process waiting on result for 86.0 seconds
2018-12-17 16:08:45,894 ERROR: Atropos error
Traceback (most recent call last):
  File "/nas/longleaf/home/dennis/Ljosalfheim/lib/python3.5/site-packages/atropos/commands/base.py", line 279, in __next__
    read_index, record = next(self.iterable)
  File "/nas/longleaf/home/dennis/Ljosalfheim/lib/python3.5/site-packages/atropos/io/seqio.py", line 448, in __iter__
    read2 = next(it2)
  File "atropos/io/_seqio.pyx", line 199, in __iter__
  File "/nas/longleaf/home/dennis/Ljosalfheim/lib/python3.5/site-packages/atropos/io/compression.py", line 128, in read
    self._raise_if_error()
  File "/nas/longleaf/home/dennis/Ljosalfheim/lib/python3.5/site-packages/atropos/io/compression.py", line 121, in _raise_if_error
    "input file truncated or corrupt?".format(retcode))
EOFError: gzip process returned non-zero exit code -9. Is the input file truncated or corrupt?

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/nas/longleaf/home/dennis/Ljosalfheim/lib/python3.5/site-packages/atropos/util/__init__.py", line 727, in run_interruptible
    func(*args, **kwargs)
  File "/nas/longleaf/home/dennis/Ljosalfheim/lib/python3.5/site-packages/atropos/commands/multicore.py", line 291, in __call__
    self.ensure_alive)
  File "/nas/longleaf/home/dennis/Ljosalfheim/lib/python3.5/site-packages/atropos/commands/multicore.py", line 507, in enqueue_all
    for item in iterable:
  File "/nas/longleaf/home/dennis/Ljosalfheim/lib/python3.5/site-packages/atropos/commands/base.py", line 286, in __next__
    self.finish()
  File "/nas/longleaf/home/dennis/Ljosalfheim/lib/python3.5/site-packages/atropos/commands/base.py", line 358, in finish
    self.reader.close()
  File "/nas/longleaf/home/dennis/Ljosalfheim/lib/python3.5/site-packages/atropos/io/seqio.py", line 463, in close
    self.reader1.close()
  File "/nas/longleaf/home/dennis/Ljosalfheim/lib/python3.5/site-packages/atropos/io/seqio.py", line 92, in close
    self._file.close()
  File "/nas/longleaf/home/dennis/Ljosalfheim/lib/python3.5/site-packages/atropos/io/compression.py", line 105, in close
    self._raise_if_error()
  File "/nas/longleaf/home/dennis/Ljosalfheim/lib/python3.5/site-packages/atropos/io/compression.py", line 121, in _raise_if_error
    "input file truncated or corrupt?".format(retcode))
EOFError: gzip process returned non-zero exit code -15. Is the input file truncated or corrupt?
2018-12-17 16:10:22|INFO|Paired End Mode
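
A note on the exit codes in the log: when Python's subprocess module reports a negative return code -N on POSIX, the child process was terminated by signal N. Assuming Atropos gets the gzip exit code this way (the log does not confirm it), -9 and -15 would correspond to SIGKILL and SIGTERM. That would point to the gzip process being killed from outside, for example by a scheduler or memory limit, rather than to a corrupt input file. A minimal sketch for decoding such codes follows; describe_returncode is a hypothetical helper, not part of Atropos.

```python
import signal


def describe_returncode(retcode):
    """Explain a return code as reported by Python's subprocess module."""
    if retcode is None:
        return "process still running"
    if retcode < 0:
        # On POSIX, subprocess reports termination by signal N as -N.
        try:
            name = signal.Signals(-retcode).name
        except ValueError:
            name = "unknown signal"
        return "killed by signal {} ({})".format(-retcode, name)
    if retcode == 0:
        return "exited normally"
    return "exited with status {}".format(retcode)


# The two codes from the tracebacks, plus ordinary exit statuses for contrast.
for code in (-9, -15, 0, 1):
    print(code, "->", describe_returncode(code))
```

If the codes do map to signals, the cluster's job accounting or kernel logs may be a better place to look than the FASTQ files themselves.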

jdidion commented 5 years ago

Thanks, and sorry for the delay in addressing this. Can you provide a minimal fastq file (or pair of files) that replicates the issue?
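
One possible way to cut a smaller test pair out of large gzipped FASTQ files is sketched below. It assumes plain 4-line FASTQ records; head_fastq and head_fastq_pair are hypothetical helpers, not Atropos functions.

```python
import gzip
from itertools import islice


def head_fastq(src_path, dst_path, n_records):
    """Copy the first n_records records from a gzipped FASTQ file.

    Assumes plain 4-line records (header, sequence, '+', qualities).
    Returns the number of complete records written.
    """
    n_lines = 0
    with gzip.open(src_path, "rt") as src, gzip.open(dst_path, "wt") as dst:
        # Stream line by line so the whole file is never held in memory.
        for line in islice(src, 4 * n_records):
            dst.write(line)
            n_lines += 1
    return n_lines // 4


def head_fastq_pair(r1, r2, out1, out2, n_records):
    """Subset both mates with the same record count so pairs stay in sync."""
    return head_fastq(r1, out1, n_records), head_fastq(r2, out2, n_records)
```

For example, head_fastq_pair("R1.fastq.gz", "R2.fastq.gz", "R1.sub.fastq.gz", "R2.sub.fastq.gz", 1000000) would write the first million read pairs (the file names here are placeholders). Because the reported failure point varies between runs, a subset may not reproduce the error if the cause lies outside the files.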

jdidion commented 4 years ago

@pkMyt1 it would be great if you could give this another try with a 2.x release (if you are able to upgrade to Python 3.6+). The code that deals with compressed files in Atropos has been replaced with the xphyle library, which should be more stable and easier to debug if an error still occurs.