FelixKrueger / TrimGalore

A wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data
GNU General Public License v3.0

(new) problem with output having reads that don't have a mate in R1 and R2 files. #143

Closed bmillerlab closed 7 months ago

bmillerlab commented 1 year ago

Hello, I used Trim Galore last year to clean up a large number of fastq.gz files that I received from my service provider. It worked well, and I was able to use the trimmed files for downstream analysis without problems. I have now received new fastq.gz files from the same library prep (Nextera) and the same service provider, and repeated the cleanup with the same command:

trim_galore --cores 8 --paired -o galore_output --length 50 --nextera -a2 GTGTAGAGCC -q 25 --fastqc [long list of files names here]

The FastQC reports look fine (I used MultiQC to look at the aggregate data): the adapters are removed and poor-quality sequence is trimmed. However, this time a subset of reads no longer pairs up between the R1 and R2 files, which causes my downstream analysis to fail. I used Geneious Prime to test several pairs of files by trying to merge them, and confirmed that all of them contain unpaired reads, i.e. Trim Galore left different sequences in R1 than in R2. I then tested the same files before they were "cleaned" in the same way (a slow process, because they are large and still contain the low-quality sequences), and they all pair perfectly, so the problem definitely arose during processing.

Could you suggest where the problem may be arising so that I can try to fix the files? I did update to the newest Trim Galore version; should I roll back to an older version if this is a new issue?

Thank you for any suggestions you may have.

FelixKrueger commented 1 year ago

Could you provide a small example command and data set that would be enough to trigger the issue? Since there is another 'service provider' involved, I am not sure this has anything to do with Trim Galore in the first place...
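
One possible way to build such a small data set is to copy the first few thousand read pairs from each file. The following is only a sketch under some assumptions: the inputs are gzipped, each record spans exactly four lines, and the function names (`head_fastq`, `head_pair`) are made up for illustration. The first reads of a file may of course not be the ones that trigger the issue.

```python
import gzip
from itertools import islice


def head_fastq(src, dst, n_reads=10000):
    """Copy the first n_reads records from one gzipped FASTQ file to another.

    Assumes every record is exactly 4 lines (header, sequence, '+', qualities).
    """
    with gzip.open(src, "rt") as fin, gzip.open(dst, "wt") as fout:
        fout.writelines(islice(fin, 4 * n_reads))


def head_pair(r1, r2, out1, out2, n_reads=10000):
    """Take the same number of leading records from R1 and R2.

    The output stays in sync only if the two inputs were in sync to begin with.
    """
    head_fastq(r1, out1, n_reads)
    head_fastq(r2, out2, n_reads)
```

For example, `head_pair("sample_R1.fastq.gz", "sample_R2.fastq.gz", "small_R1.fastq.gz", "small_R2.fastq.gz")` would write two small files (the file names here are placeholders) that can be attached to the issue together with the exact command.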

FelixKrueger commented 1 year ago

As a test, I have just run the exact command above on a test set of files, and it produces trimmed files with the exact same number of reads, as expected.
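
A check along these lines (same number of reads, in the same order) can be scripted independently of any downstream tool. This is a minimal sketch, not part of Trim Galore: it assumes four-line FASTQ records and that mates share the same read name once the comment field and any legacy `/1` or `/2` suffix are stripped. The helper names `read_ids` and `first_mismatch` are hypothetical.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield the read name of every record in a FASTQ file (gzipped or plain)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as fh:
        for i, line in enumerate(fh):
            if i % 4 != 0 or not line.strip():
                continue  # only header lines carry the read name
            # Drop the leading '@' and anything after the first whitespace.
            name = line[1:].split()[0]
            # Drop an old-style mate suffix so R1 and R2 names compare equal.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def first_mismatch(r1_path, r2_path):
    """Return (record_index, r1_name, r2_name) for the first out-of-sync
    record, or None if both files hold the same names in the same order.

    If one file is shorter, the missing name is reported as None.
    """
    pairs = zip_longest(read_ids(r1_path), read_ids(r2_path))
    for idx, (a, b) in enumerate(pairs):
        if a != b:
            return idx, a, b
    return None
```

Running `first_mismatch` on the untrimmed pair and then on the trimmed pair would show whether the files go out of sync during trimming, and at which record.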

FelixKrueger commented 7 months ago

Closing this as there was no communication for some time.