Open marwa38 opened 2 years ago
Hi @marwa38
I am just trying to understand exactly what you were trying to achieve here. Just to say, Trim Galore attempts to identify the adapter used in your system, by looking for exact matches either the Illumina standard, Nextera or Illumina small RNA adapters. If there is a clear 'winner', that adapter sequence is selected for removal (without any IUPAC ambiguity codes).
Also, the sequences you specified are lacking the A
from the A-tailing procedure. In other words, Trim Galore is not looking for these sequences, as they tend not to be read-through contaminations that inhibit read mapping, but often result from primer-dimers and the like. Such sequences do not align to any genome, so they get effectively pruned by the alignment process itself.
Is this the answer you were looking for?
Thank you for the discussion @FelixKrueger
Trim Galore attempts to identify the adapter used in your system,
Yeah I was having Nextera adaptors which is good trimGalore have removed.
Also, the sequences you specified are lacking the
A
from the A-tailing procedure.
I didn't mention any sequences in trimGalore code
trim_galore -q 30 --phred33 --paired --gzip --fastqc --fastqc_args "--nogroup" -o ./trimGalored --length 20 --stringency 1 -e 0 --clip_R1 17 --clip_R2 21 *_S"$i"_L001_R1_001.fastq.gz *_S"$i"_L001_R2_001.fastq.gz
What do you mean regarding "A"? you mean that I can't use V3-V4 primers that I above mentioned in my cutadapt codes to remove primers? you mean that TrimGalore --clip_R1 17 --clip_R2 21
have already removed all primers?
I was referring the code snippets in the original post:
#Check if primers are still inside (N, W and H nucleotides are identified by cutadapt)
FWD <- "CCTACGGGNGGCWGCAG"
REV <- "GACTACHVGGGTATCTAATCC"
but at second glance these sequences to not appear to be related to adapters after all - so please just ignore the comment about A-tailing.
I suppose my question is: why do you want to remove primer sequences at all? Trim Galore is really only meant to remove read-through adapter contamination and poor sequences from the 3'-end. Do you have a setup that requires you to remove and filter a lot more, including biases at the starts as well as ends of sequences, adapter contamination plus primer sequences?
Thanks for your reply and sorry for responding after a week @FelixKrueger . I think that removing primers along with the adaptors will make my dataset cleaner as part of the quality control. Is it a problem to remove primers? I am following the primer removal pipeline from here https://benjjneb.github.io/dada2/ITS_workflow.html
Thank you M
I just took a look at the dada2 workflow for ITS amplicon sequencing. I have to admit that I really don't feel that I know enough about 16S sequencing to give much advice on what to do, but it would appear that trimming of the amplicon primer sequences seems necessary. This doesn't necessarily mean that you couldn't accomplish your goals with Trim Galore, but you would have to know exactly what you want to be doing. Maybe it would just be easier to follow their workflow instead?
Thanks for your comments Just for more clarification; I removed the Nexetra adaptors using trimGalore as I don't know their sequences but I removed the primers using cutadapt as I know their sequences
Hi team I know that trimgalore is a wrapper around cutadapt but I want to check what I ran is considered fine I ran trimgalore to remove adaptors and primers at 5' and 3' ends but then
I found that I need to check for primers within the sequences themselves and I found and I removed them Things are ok that I did my analysis in that way? I don't want to reran the analysis again but I want to make sure that this is fine.