FelixKrueger / TrimGalore

A wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data
GNU General Public License v3.0

Adapter trimming on PE reads outputs different numbers of reads for R1 and R2 #186

Open bshim181 opened 8 months ago

bshim181 commented 8 months ago

Hello,

trim_galore -o output --fastqc --paired $R1_file $R2_file

I used Trim Galore to trim Illumina adapters from my PE sequencing reads. When I tried to process the output fq files through BBMerge for insert length analysis, I received this error message from BBMerge: "There appear to be different numbers of reads in the paired input files. The pairing may have been corrupted by an upstream process."

When I checked the length of the Trim Galore outputs for one of the samples, the R1 and R2 outputs differed. Would I have to specify --retain_unpaired for these to match up? My assumption was that if I run Trim Galore with --paired, the pairings would be retained.

314343816 Sample1_R1_001_val_1.fq
299531972 Sample1_R2_001_val_2.fq

FelixKrueger commented 8 months ago

This is unusual, I would say. Trim Galore has built-in functionality that makes it die (i.e. terminate) if the R1 and R2 input files are truncated, and sequence pairs are always handled together. There is no need to specify anything else.

Is there a chance that the files got corrupted during a copying process or the like? Are they gzip compressed, or are you showing the number of lines in the example above? Do you still have the log file?
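
One way to answer the questions above is to count reads rather than lines in both mate files. The following is a minimal sketch, not part of Trim Galore: the function names `count_reads` and `check_pairs` are made up for illustration, and it assumes standard four-line FASTQ records (no wrapped sequence lines) and that compressed files end in `.gz`.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not provided by Trim Galore.

# Print the number of FASTQ records in a plain or gzip-compressed file.
# Assumes exactly 4 lines per record.
count_reads() (
    file=$1
    case "$file" in
        *.gz) lines=$(gzip -dc "$file" | wc -l) ;;
        *)    lines=$(wc -l < "$file") ;;
    esac
    echo $((lines / 4))
)

# Print both read counts; exit status is non-zero when they differ.
check_pairs() (
    n1=$(count_reads "$1")
    n2=$(count_reads "$2")
    echo "R1: $n1 reads, R2: $n2 reads"
    [ "$n1" -eq "$n2" ]
)
```

For the sample above this could be run as `check_pairs Sample1_R1_001_val_1.fq Sample1_R2_001_val_2.fq`. If the two figures quoted earlier are line counts, they correspond to 78,585,954 and 74,882,993 reads respectively. For gzip-compressed copies, `gzip -t` can additionally test whether a file is intact before counting.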