marcelm / cutadapt

Cutadapt removes adapter sequences from sequencing reads
https://cutadapt.readthedocs.io
MIT License

Error in FASTQ file at line 20136: Length of sequence and qualities differ #786

Closed Missthepast closed 1 month ago

Missthepast commented 1 month ago

I'm using version 2.9, installed with conda. The command I ran is:

/script/conda_lib/bin/cutadapt -a GAGGAG -A AGAAAG -g CCGATCT -G CTCTAT -n 2 --trim-n --max-n 5 -q 10 -m 35 -j 15 -o 1_cut.fq.gz -p 2_cut.fq.gz 1.fq.gz 2.fq.gz --report=full >cutadapt.log

It fails with this error:

ERROR: ERROR: Traceback (most recent call last):
  File "/script/conda_lib/lib/python3.7/site-packages/cutadapt/pipeline.py", line 529, in run
    (n, bp1, bp2) = self._pipeline.process_reads()
  File "/script/conda_lib/lib/python3.7/site-packages/cutadapt/pipeline.py", line 364, in process_reads
    for read1, read2 in self._reader:
  File "/script/conda_lib/lib/python3.7/site-packages/dnaio/__init__.py", line 266, in __iter__
    r2 = next(it2)
  File "src/dnaio/_core.pyx", line 250, in fastq_iter
dnaio.exceptions.FastqFormatError: Error in FASTQ file at line 20136: Length of sequence and qualities differ

The same command runs without errors on other fastq.gz files. I also checked line 20136, and the sequence and the quality string there have the same length. Why does this error occur?

marcelm commented 1 month ago

Hi, the line number is probably incorrect (this is a known bug). The problem is probably somewhere later in the file. I think you may get the correct line number if you run Cutadapt with only one thread (-j 1).

But to be clear: I’m quite sure that there is a broken FASTQ record in the input file. This often happens when the file is truncated, so I suggest you check the very last lines in the file manually.
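The manual check suggested above can also be scripted. The following is only a minimal sketch, not part of Cutadapt or dnaio: the helper names `open_fastq` and `find_first_bad_record` are made up for illustration, and it assumes the common four-line FASTQ layout (no wrapped sequences) and no trailing blank lines. It reports the first record whose sequence and quality lengths differ, or that is cut off, and it also catches a gzip stream that ends prematurely.

```python
import gzip
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def find_first_bad_record(path):
    """Return (line_number, message) for the first broken record, or None.

    Hypothetical helper: assumes exactly four lines per record.
    """
    lines_ok = 0  # number of lines in records that passed all checks
    try:
        with open_fastq(path) as handle:
            while True:
                # Read the next record (up to four lines).
                record = [line.rstrip("\r\n") for line in islice(handle, 4)]
                if not record:
                    return None  # clean end of file
                start = lines_ok + 1  # line number of this record's header
                if len(record) < 4:
                    return start, "truncated record: only %d of 4 lines" % len(record)
                header, sequence, separator, qualities = record
                if not header.startswith("@"):
                    return start, "header does not start with '@'"
                if not separator.startswith("+"):
                    return start + 2, "separator does not start with '+'"
                if len(sequence) != len(qualities):
                    return start + 3, "sequence length %d != quality length %d" % (
                        len(sequence), len(qualities))
                lines_ok += 4
    except EOFError:
        # gzip raises EOFError when the compressed stream is cut off.
        return lines_ok + 1, "compressed stream ends prematurely near this line"
```

For example, `find_first_bad_record("2.fq.gz")` would check the second input file from the command above. The traceback fails at `r2 = next(it2)`, which may point to that file, but checking both inputs is the safer choice.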

Missthepast commented 1 month ago

Thank you very much. You were right, there is a problem at the end of the data.