wdecoster / NanoPlot

Plotting scripts for long read sequencing data
http://nanoplot.bioinf.be
MIT License

OSError: not a gzip file #323

Closed tann5er closed 1 year ago

tann5er commented 1 year ago

Hi there.

I'm trying to run NanoPlot on a dataset but I'm getting this error:

concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/apps/chpc/bio/anaconda3-2020.02/envs/NanoPlot/lib/python3.7/concurrent/futures/process.py", line 239, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/apps/chpc/bio/anaconda3-2020.02/envs/NanoPlot/lib/python3.7/concurrent/futures/process.py", line 198, in _process_chunk
    return [fn(*args) for args in chunk]
  File "/apps/chpc/bio/anaconda3-2020.02/envs/NanoPlot/lib/python3.7/concurrent/futures/process.py", line 198, in <listcomp>
    return [fn(*args) for args in chunk]
  File "/apps/chpc/bio/anaconda3-2020.02/envs/NanoPlot/lib/python3.7/site-packages/nanoget/extraction_functions.py", line 388, in process_fastq_rich
    for record in SeqIO.parse(inputfastq, "fastq"):
  File "/apps/chpc/bio/anaconda3-2020.02/envs/NanoPlot/lib/python3.7/site-packages/Bio/SeqIO/Interfaces.py", line 74, in __next__
    return next(self.records)
  File "/apps/chpc/bio/anaconda3-2020.02/envs/NanoPlot/lib/python3.7/site-packages/Bio/SeqIO/QualityIO.py", line 1085, in iterate
    for title_line, seq_string, quality_string in FastqGeneralIterator(handle):
  File "/apps/chpc/bio/anaconda3-2020.02/envs/NanoPlot/lib/python3.7/site-packages/Bio/SeqIO/QualityIO.py", line 961, in FastqGeneralIterator
    for line in handle:
  File "/apps/chpc/bio/anaconda3-2020.02/envs/NanoPlot/lib/python3.7/gzip.py", line 300, in read1
    return self._buffer.read1(size)
  File "/apps/chpc/bio/anaconda3-2020.02/envs/NanoPlot/lib/python3.7/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/apps/chpc/bio/anaconda3-2020.02/envs/NanoPlot/lib/python3.7/gzip.py", line 474, in read
    if not self._read_gzip_header():
  File "/apps/chpc/bio/anaconda3-2020.02/envs/NanoPlot/lib/python3.7/gzip.py", line 422, in _read_gzip_header
    raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'@7')

Any help will be appreciated.

wdecoster commented 1 year ago

The error comes from gzip and Biopython, both of which are used to parse your FASTQ file. Your file seems to have a .gz extension, but could it be that it is not actually compressed? What does the file command report for your input file, e.g. file reads.fastq.gz?
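
For anyone who wants to run the same check from Python, here is a minimal sketch. A gzip stream always starts with the two bytes 0x1f 0x8b, whereas the bytes reported in the error, b'@7', look like the start of a plain-text FASTQ header line. The helper names is_gzipped and open_reads are hypothetical, made up for this illustration; they are not part of NanoPlot or nanoget.

```python
import gzip

# Every gzip stream begins with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def open_reads(path):
    """Open a FASTQ file as text, decompressing only if it is really gzipped.

    Hypothetical helper: it trusts the file content, not the extension.
    """
    if is_gzipped(path):
        return gzip.open(path, "rt")
    return open(path, "rt")
```

If is_gzipped returns False for a file named reads.fastq.gz, the extension is misleading. In that case, renaming the file to reads.fastq, or compressing it properly with gzip, should let the parser pick the right reader.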