wdecoster / NanoPlot

Plotting scripts for long read sequencing data
http://nanoplot.bioinf.be
MIT License

NanoPlot crashes #280

Closed NiklasDreyer closed 2 years ago

NiklasDreyer commented 2 years ago

Hi,

Thanks for what appears to be an awesome program.

NanoPlot repeatedly crashes for me and I can't get any quality information out on my reads.

I can't locate the error, and reinstalling and updating did not help. I installed NanoPlot with conda and run it in a conda environment on macOS.

Could I ask for some help understanding the error and finding a possible solution?

Best,
Niklas

log file

2021-11-12 17:42:26,910 NanoPlot 1.32.1 started with arguments Namespace(N50=True, alength=False, bam=None, barcoded=False, color='yellow', colormap='Greens', cram=None, downsample=None, dpi=100, drop_outliers=False, fasta=None, fastq=['m54057_190926_040405.Q20.fastq'], fastq_minimal=None, fastq_rich=None, feather=None, font_scale=1, format='png', hide_stats=False, huge=False, listcolormaps=False, listcolors=False, loglength=False, maxlength=10000, minlength=50, minqual=None, no_N50=False, no_supplementary=False, outdir='/Users/niklasdreyer/Desktop/nanoplot', path='/Users/niklasdreyer/Desktop/nanoplot/plot', percentqual=False, pickle=None, plots=['kde', 'dot'], prefix='plot', raw=True, readtype='1D', runtime_until=None, store=False, summary=None, threads=2, title='CCS_raw_Q20_Facetotecta', tsv_stats=False, ubam=None, verbose=False)
2021-11-12 17:42:26,910 Python version is: 3.7.7 (default, Mar 26 2020, 10:32:53) [Clang 4.0.1 (tags/RELEASE_401/final)]
2021-11-12 17:42:26,912 NanoPlot: valid output format png
2021-11-12 17:42:26,939 Nanoget: Starting to collect statistics from plain fastq file.
2021-11-12 17:42:30,398 End of file without quality information.
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/concurrent/futures/process.py", line 239, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/concurrent/futures/process.py", line 198, in _process_chunk
    return [fn(*args) for args in chunk]
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/concurrent/futures/process.py", line 198, in <listcomp>
    return [fn(*args) for args in chunk]
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/site-packages/nanoget/extraction_functions.py", line 327, in process_fastq_plain
    data=[res for res in extract_from_fastq(inputfastq) if res],
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/site-packages/nanoget/extraction_functions.py", line 327, in <listcomp>
    data=[res for res in extract_from_fastq(inputfastq) if res],
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/site-packages/nanoget/extraction_functions.py", line 337, in extract_from_fastq
    for rec in SeqIO.parse(fq, "fastq"):
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/site-packages/Bio/SeqIO/Interfaces.py", line 73, in __next__
    return next(self.records)
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/site-packages/Bio/SeqIO/QualityIO.py", line 1080, in iterate
    for title_line, seq_string, quality_string in FastqGeneralIterator(handle):
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/site-packages/Bio/SeqIO/QualityIO.py", line 939, in FastqGeneralIterator
    raise ValueError("End of file without quality information.")
ValueError: End of file without quality information.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/site-packages/nanoplot/NanoPlot.py", line 66, in main
    keep_supp=not(args.no_supplementary))
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/site-packages/nanoget/nanoget.py", line 94, in get_input
    dfs=[out for out in executor.map(extraction_function, files)],
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/site-packages/nanoget/nanoget.py", line 94, in <listcomp>
    dfs=[out for out in executor.map(extraction_function, files)],
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/concurrent/futures/process.py", line 483, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/concurrent/futures/_base.py", line 598, in result_iterator
    yield fs.pop().result()
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/concurrent/futures/_base.py", line 435, in result
    return self.__get_result()
  File "/Users/niklasdreyer/miniconda3/envs/nanoplot/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
ValueError: End of file without quality information.

Excerpt from the fastq file, which does not seem to be corrupt. It runs fine with other scripts.

@m54057_190926_040405/4194389/ccs
GCAATGAAGTCGCAGGGTTGGGATCAGTTGATTCTGACCAACCGTCCAGCACTCGACATGTTCCAGATCGTCCTCGTCGCCGCCCTCGTGGCCTCCGCTGCCGCCGCCAACACTGGCTACAGGGCCCCTTCATACGGCCATGCCGCCGCCCCCGCCTACGGAGGCCACCGTGCCGCCGGATACGGTTACGGCGCCGACTACTATGAGGAGCCCAACTACAACTACGAGTACGCCGTGAAGGACGACTACAGCGGCAACGACTTCGGCCAGACTGAGTACCGTGACGGATACAGCACCAAGGGATCCTACTCCGTCGCCCTCCCCGACGGCCGCACCCAGACCGTCACCTACGCCGATGATGGCTACGGTCTCGTTGCCGACGTCTCCTACTACGGCGAGGCCCGCTACGACAGCTACGGACACGGATATGCTGCCCCATCTTACGGCTACAAGGCCGCTCCCTCTTATGGCCATGCCCGTCCTTCCTACGGATACAAGGCTGCTCCTTCTTATGGATACAAGGCCGCTCCCTCTTACGGATACAGGGCTGCCGCTCCCTCTTACGGGTACAGGGCTGCCGCTCCTTCTTATGGCTACAGAGCCGCCCCCGCTTACGGATATAGCCGTCCCTCTTATGGATACAGCCGTCCCTCCTACGGATACGGATACAGCCGTCCTTCTTACGGATACAGAGCCGCTGCTCCCGCTTACGGCAAGCACTAAGTAGAGACATGAACAAACCCGTCTTCTTTCTTCTCTCATACCATCTGGCTATCGCAATTTGATTGTGCTCTCCTACTTTACCCATCCGCATTCAGCTCATTCCAACTTTCCACAGACATGCACGCACACCTCAGTTTCACACCCTCTCTCTTATCTGTAAATTACAGATACTAAGTCAGTTTCTTCATATTTATCTGATATTTTAACTACTATTTATCCTGTTGTACGTCAATAAACGTTTTGTAGTGAAAAAAAAAAAAAAAAAAAAAAAAAAGTACTCTGCGTTGATACCACTGCTT
+

wdecoster commented 2 years ago

Hi,

Thanks for reporting this. The error comes from Biopython, the module I use to parse fastq files. It does suggest something is wrong with your input file. Is that a sequence without a phred quality string in your example?

Wouter

NiklasDreyer commented 2 years ago

Thanks for the fast reply. I re-installed Biopython and ran pip install NanoPlot --upgrade

I double-checked my fastq files, both ccs and isoseq. Indeed, the last sequence in each file had no quality scores assigned to it. I should have checked that, but thank you for your fast reply.

Take care,
Niklas
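
For readers who hit the same "End of file without quality information" error, the diagnosis in this thread (a final record with no quality line) can be checked before running NanoPlot. The following is only a minimal sketch and is not part of NanoPlot, nanoget, or Biopython. The function name `find_bad_fastq_records` is hypothetical, and the code assumes an uncompressed, four-line (unwrapped) FASTQ file in which blank lines can be ignored.

```python
from itertools import zip_longest


def find_bad_fastq_records(path):
    """Return (record_number, header, problem) tuples for malformed records.

    Assumption: plain-text FASTQ with exactly four lines per record
    (header, sequence, '+' separator, quality). Blank lines are skipped.
    """
    problems = []
    with open(path) as handle:
        lines = (line.rstrip("\r\n") for line in handle)
        non_blank = (line for line in lines if line)
        # Passing the same iterator four times groups the lines in fours;
        # a truncated final record is padded with None.
        records = zip_longest(*[non_blank] * 4)
        for number, (header, seq, plus, qual) in enumerate(records, start=1):
            if not header.startswith("@"):
                problem = "header does not start with '@'"
            elif seq is None:
                problem = "sequence line is missing"
            elif plus is None or not plus.startswith("+"):
                problem = "separator line '+' is missing"
            elif qual is None:
                problem = "quality line is missing"
            elif len(qual) != len(seq):
                problem = "quality and sequence lengths differ"
            else:
                continue  # record looks structurally complete
            problems.append((number, header, problem))
    return problems
```

Calling `find_bad_fastq_records("m54057_190926_040405.Q20.fastq")` on a file like the one in this issue would be expected to report the last record with "quality line is missing". Note that once one record is out of step, every later record in the same file is also reported, so the first entry in the list is the one to inspect.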