jdidion / atropos

An NGS read trimming tool that is specific, sensitive, and speedy. (production)

Add option to disable use of system gzip #62

Open gmagoon opened 6 years ago

gmagoon commented 6 years ago

Hi John,

I've just set up Atropos v1.1.17 on Ubuntu 16.04 via pip3 install atropos. I'm running into errors like the one shown below when running basic commands with the error and detect modules. I've checked the input files with gzip -t and gzip -l, and no problems show up (other than the known issue with the latter for files above 4GB). I've tried several FASTQ pairs and hit the same error each time.

Am I correct in suspecting that the -15 exit code corresponds to termination via SIGTERM? If so, does this necessarily indicate a problem, or could the termination signal have come from Atropos itself?

I'm probably overlooking something obvious, since it doesn't look like anyone else has reported this error.

Thanks in advance for any tips you can provide,
Greg

~$ atropos error -pe1 R1_001.fastq.gz -pe2 R2_001.fastq.gz
2018-03-14 23:26:24,086 INFO: This is Atropos 1.1.17 with Python 3.5.2
2018-03-14 23:26:28,428 ERROR: Atropos error
Traceback (most recent call last):
  File "/home/gmagoon/.local/lib/python3.5/site-packages/atropos/util/__init__.py", line 727, in run_interruptible
    func(*args, **kwargs)
  File "/home/gmagoon/.local/lib/python3.5/site-packages/atropos/commands/base.py", line 22, in __call__
    for batch in command_runner.iterator():
  File "/home/gmagoon/.local/lib/python3.5/site-packages/atropos/commands/base.py", line 290, in __next__
    self.finish()
  File "/home/gmagoon/.local/lib/python3.5/site-packages/atropos/commands/base.py", line 358, in finish
    self.reader.close()
  File "/home/gmagoon/.local/lib/python3.5/site-packages/atropos/io/seqio.py", line 463, in close
    self.reader1.close()
  File "/home/gmagoon/.local/lib/python3.5/site-packages/atropos/io/seqio.py", line 92, in close
    self._file.close()
  File "/home/gmagoon/.local/lib/python3.5/site-packages/atropos/io/compression.py", line 105, in close
    self._raise_if_error()
  File "/home/gmagoon/.local/lib/python3.5/site-packages/atropos/io/compression.py", line 121, in _raise_if_error
    "input file truncated or corrupt?".format(retcode))
EOFError: gzip process returned non-zero exit code -15. Is the input file truncated or corrupt?
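
On the question of what the -15 exit code means: Python's subprocess module reports a negative return code -N when a child process was terminated by signal N (POSIX only), so -15 is consistent with SIGTERM. A minimal sketch of decoding such a code follows. The helper name describe_returncode is hypothetical and is not part of Atropos.

```python
import signal


def describe_returncode(retcode):
    """Describe a subprocess return code using the POSIX convention.

    Hypothetical helper for illustration; not part of Atropos.
    """
    if retcode is None:
        return "still running"
    if retcode == 0:
        return "exited normally"
    if retcode < 0:
        # A negative code -N means the child was terminated by signal N.
        signum = -retcode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return "terminated by signal {} ({})".format(signum, name)
    return "exited with status {}".format(retcode)


print(describe_returncode(-15))
```

This only identifies the signal. It does not say which process sent it, so it cannot by itself tell whether the SIGTERM came from Atropos or from something else on the system.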

gmagoon commented 6 years ago

Just wanted to follow up on this, as I now have a better picture of what is going on. I noticed that the error module would intermittently run without error. I was also running a number of other processes at the time. Now that some of those processes have finished, both the error and detect modules seem to run reliably. I suspect one of my processes caused a conflict, along the lines described here: https://unix.stackexchange.com/questions/71059/system-sending-sigterm-and-sigkill-during-normal-work . Since this seems to be a very esoteric problem specific to my setup, please feel free to close this issue.

jdidion commented 6 years ago

Thanks for reporting this. As you've probably figured out, Atropos tries to use the system gzip for compression when it is available, because it's much faster than the Python gzip module. However, there should probably be a way to turn this behavior off when it causes problems like the one you observed. I am going to update this issue to be a request for that enhancement.
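
For illustration, the requested switch could look roughly like the sketch below. This is not Atropos's actual code: the function read_gzip_lines and its use_system_gzip parameter are hypothetical names, and the sketch assumes a gzip executable that accepts the -c and -d flags. It prefers the system gzip when allowed and found on the PATH, and otherwise falls back to the Python gzip module.

```python
import gzip
import io
import shutil
import subprocess


def read_gzip_lines(path, use_system_gzip=True):
    """Yield text lines from a gzip-compressed file.

    use_system_gzip is a hypothetical switch, not an existing Atropos option.
    """
    gzip_exe = shutil.which("gzip") if use_system_gzip else None
    if gzip_exe is None:
        # Pure-Python path: slower, but there is no child process
        # that another program could terminate with a signal.
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            yield from handle
        return

    # System path: decompress to stdout in a child process and read its pipe.
    proc = subprocess.Popen([gzip_exe, "-cd", path], stdout=subprocess.PIPE)
    with io.TextIOWrapper(proc.stdout, encoding="utf-8") as handle:
        yield from handle
    retcode = proc.wait()
    if retcode != 0:
        raise EOFError("gzip exited with code {}".format(retcode))
```

Calling list(read_gzip_lines("reads.fastq.gz", use_system_gzip=False)) would then read the file without spawning a gzip process, at the cost of slower decompression. The file name here is only a placeholder.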

gmagoon commented 6 years ago

Sounds great. Thanks very much for the additional info, John.