bhattlab / MGEfinder

A toolbox for identifying mobile genetic element (MGE) insertions from short-read sequencing data of bacterial isolates.
MIT License

cutadapt: error: FASTQ file ended prematurely Cutadapt terminated with exit signal: '256'. #45

Closed qianxin-kxy closed 1 year ago

qianxin-kxy commented 1 year ago

Hello, this is a very useful tool. I have two questions:

  1. Can I use fastp instead of SuperDeduper and Trim Galore to process the raw sequencing data and obtain clean reads for the downstream analysis? The company that did my sequencing uses fastp and filters the raw data with the following steps: (1) discard a read pair if either read contains adapter contamination; (2) discard a read pair if more than 10% of the bases in either read are uncertain; (3) discard a read pair if more than 50% of the bases in either read are low quality (Phred quality <5).
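To make filter rules (2) and (3) above concrete, here is a minimal Python sketch of the pair-level decision. It is not fastp's implementation and not part of MGEfinder. It assumes that "uncertain" bases are `N` calls and that qualities are Phred+33 encoded, and the function names (`uncertain_fraction`, `low_quality_fraction`, `read_passes`, `keep_pair`) are hypothetical. Rule (1), adapter detection, is left out.

```python
def uncertain_fraction(sequence):
    """Fraction of bases that are N (assumed meaning of 'uncertain')."""
    if not sequence:
        return 0.0
    return sequence.upper().count("N") / len(sequence)


def low_quality_fraction(quality, threshold=5, offset=33):
    """Fraction of bases whose Phred score is below `threshold`.

    Assumes Phred+33 (Sanger) encoding of the quality string.
    """
    if not quality:
        return 0.0
    low = sum(1 for ch in quality if ord(ch) - offset < threshold)
    return low / len(quality)


def read_passes(sequence, quality):
    """A read passes unless it has >10% N bases or >50% low-quality bases."""
    return (uncertain_fraction(sequence) <= 0.10
            and low_quality_fraction(quality) <= 0.50)


def keep_pair(read1, read2):
    """Keep a pair only if both (sequence, quality) reads pass."""
    return read_passes(*read1) and read_passes(*read2)
```

Each read is passed as a `(sequence, quality)` tuple, so a pair is dropped as soon as either mate fails, which matches the "in either one read" wording of the rules.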

  2. I encountered the following problem when using Trim Galore to process my data, and I don't know how to handle it:

(trim-galore) [KXY@zju 673]$ trim_galore --fastqc --paired 673.nodup_R1.fastq.gz 673.nodup_R2.fastq.gz --cores 8
Path to Cutadapt set as: 'cutadapt' (default)
Cutadapt seems to be working fine (tested command 'cutadapt --version')
Cutadapt version: 1.18
Could not detect version of Python used by Cutadapt from the first line of Cutadapt (but found this: >>>#!/bin/sh<<<)
Letting the (modified) Cutadapt deal with the Python version instead
pigz 2.6
Parallel gzip (pigz) detected. Proceeding with multicore (de)compression using 8 cores

Proceeding with 'pigz -p 4' for decompression
To decrease CPU usage of decompression, please install 'igzip' and run again

No quality encoding type selected. Assuming that the data provided uses Sanger encoded Phred scores (default)

AUTO-DETECTING ADAPTER TYPE

Attempting to auto-detect adapter type from the first 1 million sequences of the first file (>> 673.nodup_R1.fastq.gz <<)

Found perfect matches for the following adapter sequences:
Adapter type   Count   Sequence        Sequences analysed   Percentage
Illumina       905     AGATCGGAAGAGC   1000000              0.09
Nextera        1       CTGTCTCTTATA    1000000              0.00
smallRNA       0       TGGAATTCTCGG    1000000              0.00
Using Illumina adapter for trimming (count: 905). Second best hit was Nextera (count: 1)

Writing report to '673.nodup_R1.fastq.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS

Input filename: 673.nodup_R1.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.10
Cutadapt version: 1.18
Python version: could not detect
Number of cores used for trimming: 8
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Running FastQC on the data once trimming has completed
Output file(s) will be GZIP compressed

Cutadapt seems to be reasonably up-to-date. Setting -j 8
Writing final adapter and quality trimmed output to 673.nodup_R1_trimmed.fq.gz

Now performing quality (cutoff '-q 20') and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file 673.nodup_R1.fastq.gz <<<
This is cutadapt 1.18 with Python 3.7.12
Command line parameters: -j 8 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC 673.nodup_R1.fastq.gz
Processing reads on 8 cores in single-end mode ...
ERROR: Traceback (most recent call last):
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/cutadapt/pipeline.py", line 412, in reader_process
    pipe.send_bytes(chunk)
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 88, in __exit__
    self.close()
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 215, in close
    self._raise_if_error()
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 231, in _raise_if_error
    raise IOError(message)
OSError

ERROR: Traceback (most recent call last):
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/cutadapt/pipeline.py", line 412, in reader_process
    pipe.send_bytes(chunk)
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 88, in __exit__
    self.close()
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 215, in close
    self._raise_if_error()
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 231, in _raise_if_error
    raise IOError(message)
OSError

ERROR: Traceback (most recent call last):
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/cutadapt/pipeline.py", line 412, in reader_process
    pipe.send_bytes(chunk)
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 88, in __exit__
    self.close()
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 215, in close
    self._raise_if_error()
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 231, in _raise_if_error
    raise IOError(message)
OSError

ERROR: Traceback (most recent call last):
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/cutadapt/pipeline.py", line 412, in reader_process
    pipe.send_bytes(chunk)
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 88, in __exit__
    self.close()
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 215, in close
    self._raise_if_error()
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 231, in _raise_if_error
    raise IOError(message)
OSError

ERROR: Traceback (most recent call last):
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/cutadapt/pipeline.py", line 412, in reader_process
    pipe.send_bytes(chunk)
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 88, in __exit__
    self.close()
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 215, in close
    self._raise_if_error()
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 231, in _raise_if_error
    raise IOError(message)
OSError

ERROR: Traceback (most recent call last):
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/cutadapt/pipeline.py", line 412, in reader_process
    pipe.send_bytes(chunk)
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 88, in __exit__
    self.close()
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 215, in close
    self._raise_if_error()
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/xopen/__init__.py", line 231, in _raise_if_error
    raise IOError(message)
OSError

ERROR: Traceback (most recent call last):
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/cutadapt/pipeline.py", line 486, in run
    (n, bp1, bp2) = self._pipeline.process_reads()
  File "/data/users/KXY/miniconda3/envs/trim-galore/lib/python3.7/site-packages/cutadapt/pipeline.py", line 230, in process_reads
    for read in self._reader:
  File "src/cutadapt/_seqio.pyx", line 176, in __iter__
cutadapt.seqio.FormatError: FASTQ file ended prematurely

cutadapt: error: FASTQ file ended prematurely

Cutadapt terminated with exit signal: '256'. Terminating Trim Galore run, please check error message(s) to get an idea what went wrong...
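
The "FASTQ file ended prematurely" message means the parser reached the end of the input in the middle of a record. One possible cause, not confirmed anywhere in this thread, is that the gzipped input is truncated or corrupted. The following Python sketch shows one way to test for that. The function name `check_fastq_gz` is hypothetical, and the check assumes ordinary 4-line FASTQ records (no wrapped sequence lines).

```python
import gzip
import zlib


def check_fastq_gz(path):
    """Return (complete_records, problem); problem is None if the file looks intact.

    Assumes 4-line FASTQ records: header, sequence, '+' line, quality.
    """
    n_records = 0
    try:
        with gzip.open(path, "rt") as handle:
            while True:
                lines = [handle.readline() for _ in range(4)]
                if lines[0] == "":
                    # End of file reached exactly on a record boundary.
                    return n_records, None
                if "" in lines:
                    # End of file reached before all four lines were read.
                    return n_records, "file ends in the middle of a record"
                header, seq, plus, qual = (line.rstrip("\r\n") for line in lines)
                if not header.startswith("@") or not plus.startswith("+"):
                    return n_records, "malformed record after record %d" % n_records
                if len(seq) != len(qual):
                    return n_records, "sequence and quality lengths differ after record %d" % n_records
                n_records += 1
    except (EOFError, OSError, zlib.error) as exc:
        # Raised by the gzip module when the compressed stream is cut off or damaged.
        return n_records, "gzip stream is truncated or corrupt: %s" % exc
```

Calling `check_fastq_gz("673.nodup_R1.fastq.gz")` and the same for the R2 file returns the number of complete records together with a description of the first problem found, or `None` if none was found. If both files are intact, the two record counts should also match for paired-end data. On the command line, `gzip -t` performs a comparable integrity test of the compressed stream.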

durrantmm commented 1 year ago

Hello!

Yes, you can clean your reads however you would like. Fastp or BB-suite should work. I'd always double-check your read quality using a tool like FastQC.