Closed Maarten-vd-Sande closed 5 years ago
Hi Maarten,
I am afraid that, as it stands, Trim Galore is not set up to skip adapter trimming altogether.
One option would of course be to do the checking externally, e.g. I believe FastQC writes out a table with the counts for the Illumina, Nextera and small RNA adapters anyway, so one could in theory read that and base the trimming/no trimming decision on that.
We could of course add an option that skips the adapter trimming (or sets -a X itself) if really absolutely no adapter is found. The question would then be though: where do you draw the line? At really 0 / 0 / 0 counts for all three adapters? What happens if there is 1 count for one of them? Just to remind you, the auto-detection works on the first 1 million sequences, and the adapter sequences are 12-13 bp long, so very occasionally you might find one of the sequences in a read by chance, or because it occurs in the genome somewhere...
I could of course leave that thresholding problem up to you by adding an integer threshold count that you need to set yourself (anything between 0 and ?), e.g. --consider_already_trimmed [INT]. Is that something that would help in your case?
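To make the thresholding idea concrete, here is a rough Python sketch of how such a check could be done externally. It is not Trim Galore's actual code. The three sequences are the adapter stretches used for auto-detection, but the function names, the `max_reads` parameter and the "no count exceeds the threshold" rule are assumptions made for illustration.

```python
import io
from itertools import islice

# Adapter stretches (12-13 bp) checked during auto-detection.
ADAPTERS = {
    "Illumina": "AGATCGGAAGAGC",
    "Nextera": "CTGTCTCTTATA",
    "smallRNA": "TGGAATTCTCGG",
}


def count_adapters(handle, max_reads=1_000_000):
    """Count reads containing each adapter among the first max_reads FASTQ records.

    `handle` is any iterable of text lines in four-line FASTQ format.
    """
    counts = {name: 0 for name in ADAPTERS}
    # The sequence is the second line (index 1) of every four-line record.
    for line in islice(handle, 1, max_reads * 4, 4):
        seq = line.strip().upper()
        for name, adapter in ADAPTERS.items():
            if adapter in seq:
                counts[name] += 1
    return counts


def looks_already_trimmed(counts, threshold):
    """Assumed decision rule: no adapter count exceeds the user-set threshold."""
    return max(counts.values()) <= threshold


# Tiny in-memory example: one read carries the Illumina adapter, one does not.
example = io.StringIO(
    "@r1\nACGTAGATCGGAAGAGCTT\n+\nIIIIIIIIIIIIIIIIIII\n"
    "@r2\nACGTACGT\n+\nIIIIIIII\n"
)
example_counts = count_adapters(example)
print(example_counts, looks_already_trimmed(example_counts, threshold=1))
```

For a real file one would pass something like `gzip.open(path, "rt")` as the handle. A threshold of 0 reproduces the strict "0 / 0 / 0" rule, while a small positive value tolerates chance matches.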
Thanks for the fast reply.
I would be very happy with the solution you propose, and leaving the 'thresholding problem' to me/the user.
Alright, let me see what I can do (probably tomorrow though).
No hurry, thanks a lot!
Sorry for not being entirely truthful; I have now tried to add the option --consider_already_trimmed INT to do pretty much exactly what we discussed. Could you clone the latest development version and see if it works in your hands? Addressed here: 06622790e6a0cbd132159fa6c5219315724ff955.
A description should come up with trim_galore --help.
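For anyone calling this from a pipeline, a minimal sketch of how the new option might be wired in from Python. The helper name, the file name `sample.fastq.gz` and the threshold of 10 are made up for illustration; only `--consider_already_trimmed INT` comes from this thread.

```python
import shlex


def trim_galore_command(fastq, threshold):
    """Build a trim_galore argument list that passes the already-trimmed threshold."""
    if threshold < 0:
        raise ValueError("threshold must be a non-negative integer")
    return ["trim_galore", "--consider_already_trimmed", str(threshold), fastq]


# Hypothetical input file and threshold.
cmd = trim_galore_command("sample.fastq.gz", 10)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where Trim Galore is installed.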
Can't complain! It seems to work as intended (tried on two samples, one above threshold and one below).
Thanks a lot
p.s. any idea when this will be 'released'?
Thanks for the feedback. Given that there are some four changes already I suppose we could make a release soon. Let me just grab a coffee...
Here we go, v0.6.4 has just been posted. Enjoy!
https://github.com/FelixKrueger/TrimGalore/releases/tag/0.6.4
When no adapter is found, Trim Galore defaults to the Illumina adapter sequence and tries to remove it.
I have a pipeline that trims automatically using Trim Galore; however, people sometimes send me data that is already trimmed. Since I do not want to make assumptions about whether the data is trimmed or not, I still run it through Trim Galore. I was wondering whether there is some option I failed to find so that, if no adapter is found, quality control is still performed but no adapter trimming.
I am trying to avoid manually checking if adapters exist and adding -a -X (#51)