bounlu closed this issue 2 years ago
If the intention is to remove a few additional bases whenever an adapter is found, then you can just add the appropriate number of N wildcard characters to the adapter sequence. So if you would normally search for -a ACGT and want to remove the three bases preceding the match, use -a NNNACGT instead. For a larger number of N, you can use the curly brace notation and provide an explicit number: -a N{8}ACGT (N{8} is the same as writing N eight times).
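To make the curly brace notation concrete, here is a minimal Python sketch of the expansion described above. The helper name expand_repeats is hypothetical and only illustrates the notation; it is not part of cutadapt and does not reflect how cutadapt parses adapter sequences internally.

```python
import re


def expand_repeats(adapter: str) -> str:
    """Expand X{n} into X repeated n times, e.g. N{3}ACGT -> NNNACGT.

    Hypothetical helper for illustration only; not a cutadapt function.
    """
    return re.sub(
        r"([A-Za-z])\{(\d+)\}",
        lambda m: m.group(1) * int(m.group(2)),
        adapter,
    )


print(expand_repeats("N{8}ACGT"))  # NNNNNNNNACGT
```

With this, -a N{8}ACGT and -a NNNNNNNNACGT describe the same adapter sequence.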
Great, that would work for me, thank you.
Just wondering one thing. Are these two equivalent?
cutadapt -a NNNACGT
cutadapt -a ACGT | cutadapt -u -3 -
It depends on how --overlap is set.
If --overlap=3 is used (the default), then these commands behave similarly. The NNNACGT sequence matches the 3' end of every read because the three Ns fulfill the minimum overlap criterion, so no matter how the read ends, at least three bases are removed.
If you have a larger overlap, let’s say --overlap=6, then the first command will find a match in a read that ends in ...ACG and remove it and the three bases preceding it, but if there’s no match, the read would remain untrimmed. The second command, by contrast, always removes three bases, because -u -3 is applied to every read.
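The difference can be illustrated with a toy model in Python. This is only a sketch under simplifying assumptions: it uses exact matching (no error tolerance), treats N in the adapter as matching any base, and picks the leftmost match. It is not cutadapt's actual alignment algorithm, and the function names trim_3prime and adapter_then_clip are made up for this example.

```python
def trim_3prime(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Toy 3' adapter trimming: exact matching, N in the adapter matches any base."""
    for start in range(len(read)):
        # The adapter may run off the 3' end of the read (partial match).
        overlap = min(len(adapter), len(read) - start)
        if overlap < min_overlap:
            break  # later start positions only give shorter overlaps
        window = read[start:start + overlap]
        if all(a == "N" or a == b for a, b in zip(adapter, window)):
            return read[:start]  # drop the match and everything after it
    return read  # no match: read stays untrimmed


def adapter_then_clip(read: str, adapter: str, clip: int = 3, min_overlap: int = 3) -> str:
    """Toy model of `cutadapt -a ADAPTER | cutadapt -u -3 -`: trim, then always clip."""
    trimmed = trim_3prime(read, adapter, min_overlap)
    return trimmed[:-clip] if clip else trimmed


no_adapter = "GGGGGGGGGG"   # contains no adapter at all
partial = "GGGGTTTACG"      # ends in ...ACG (partial adapter)

# With an overlap of 3, both approaches remove the same bases.
print(trim_3prime(no_adapter, "NNNACGT", 3))   # GGGGGGG
print(adapter_then_clip(no_adapter, "ACGT"))   # GGGGGGG

# With an overlap of 6, the N-padded adapter no longer matches every read.
print(trim_3prime(no_adapter, "NNNACGT", 6))   # GGGGGGGGGG (untrimmed)
print(trim_3prime(partial, "NNNACGT", 6))      # GGGG
```

In this model, the N-padded adapter only removes the extra bases when a match is long enough, while the two-step variant clips three bases from every read regardless of whether an adapter was found.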
Is there a way to remove bases from the 3' end AFTER adapter trimming? Like the --three_prime_clip_r1 parameter in trim_galore, which is needed for Nextflex libraries.