FelixKrueger / TrimGalore

A wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data
GNU General Public License v3.0

Trim galore gets stuck on quality and adapter trimming #178

Open aman-akash opened 10 months ago

aman-akash commented 10 months ago

Hey,

I am trying to trim PacBio sequencing data in a FastQ file using Trim Galore. However, in every run Trim Galore gets stuck on the trimming step and never proceeds to the next step. I can see that the trimmed file is created and has a plausible size. When I checked the run status, I saw that the Trim Galore job had been suspended for over a day.

Any help regarding this would be great. Thank you for making a nice tool.

Cheers, Aman

Command: trim_galore -q 7 --fastqc --length 1000 -j 4 -o trimmed ../SRR16*****.fastq

I am also attaching the report generated.

SUMMARISING RUN PARAMETERS
==========================
Input filename: ../SRR16******.fastq
Trimming mode: single-end
Trim Galore version: 0.6.10
Cutadapt version: 4.5
Python version: could not detect
Number of cores used for trimming: 4
Quality Phred score cutoff: 7
Quality encoding type selected: ASCII+33
Using Nextera adapter for trimming (count: 468). Second best hit was smallRNA (count: 329)
Adapter sequence: 'CTGTCTCTTATA' (Nextera Transposase sequence; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length before a sequence gets removed: 1000 bp
Running FastQC on the data once trimming has completed

FelixKrueger commented 10 months ago

I am not sure I can be of great help here as I have never tried to trim PacBio reads before. Trim Galore is specifically meant to be a short-read trimmer with a focus on Illumina read data. I am not sure if Cutadapt itself is capable of trimming PacBio reads, but maybe there is software out there that is specifically designed to work with it?

aman-akash commented 10 months ago

I read that it worked for some people, even with PacBio reads.

Thanks for the clarification.

Cheers.

FelixKrueger commented 10 months ago

Hmm, maybe it does work but just takes a long time? I would probably first try it out with a small subset to see if it works in principle, and if it does, throw some more parallel compute power at it (via -j). Just monitor the system performance (e.g. via top) to make sure that it is still running.
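
The subset test suggested above could be sketched roughly as follows. This is only an illustration: it assumes a standard FastQ layout with exactly four lines per read, and the helper name `subset_fastq` and the file names `reads.fastq` and `subset.fastq` are hypothetical placeholders, not part of Trim Galore. The commented-out `trim_galore` call reuses the options from the original command.

```shell
# Copy the first N reads from a FastQ file into a smaller test file.
# Assumption: every record occupies exactly 4 lines (header, sequence, '+', qualities).
# Arguments: $1 = input FastQ, $2 = number of reads to keep, $3 = output FastQ
subset_fastq() {
    head -n "$(( $2 * 4 ))" "$1" > "$3"
}

# Hypothetical usage (file names are placeholders):
#   subset_fastq reads.fastq 10000 subset.fastq
#   trim_galore -q 7 --fastqc --length 1000 -j 4 -o trimmed_subset subset.fastq
# While this runs, check in a second terminal (e.g. with top) that the process
# is still using CPU rather than sitting idle or suspended.
```

If the subset finishes in reasonable time, the run time on the full file can be estimated from the ratio of read counts before deciding whether to raise `-j`.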