Open martina-nasuelli opened 3 months ago
I would be a bit concerned about the fraction of reads you are losing in the chimera removal step. I'm not familiar with these primers and library preparation -- is it true that the reverse primer is 26nts long and is sequenced, but that the forward primer is not sequenced? That is what it looks like based on your filterAndTrim(..., trimLeft=c(0,26))
choice.
In addition, why have you chosen mergePairs(..., returnRejects=TRUE)
? This is really only intended for troubleshooting, not as part of a normal workflow. removeBimeraDenovo(..., allowOneOff=TRUE)
is also not a recommended option anymore.
I am not an expert on CO1 metabarcoding, but I would start by BLAST-ing some of your top ASVs against a broad database like nt, and comparing the best hits to what you are getting from taxonomic assignment against MIDORI. Do they mostly match up? Or is there an obvious mismatch even among the abundant ASVs?
Hello Benjamin, I've rerun the dada2 script based on your suggestions, adding the Trimleft option for the forward primer (26bp). I've obtained a higher percentage of non-chimeras (0.9928853), but still got no taxonomy for my ASV. Yes, I've blasted the generated sequences. The primers I used are meant to amplify the CO1 gene for Metazoan; most of the best matches I obtain from BLAST, in terms of query over and similarity, are Fungi. I am referring to 98% query cover and 90% similarity, which is very low. Some Metazoan are returned with 87-88% of similarity.
Thank you again
Martina Nasuelli, PhD student Università del Piemonte Orientale Dipartimento per lo Sviluppo Sostenibile e la Transizione Ecologica (DiSSTE) Piazza Sant'Eusebio 5, 13100 - Vercelli, Italy
Google Scholar https://scholar.google.com/citations?hl=it&view_op=list_works&gmla=AJsN-F7yLKtw4EQIsjnfDMRSWzFmyF4E48O6LKKYsKLNV08MRAaezbKyZIVeZwdEvgWSd4-IHaEipDdykvftRsRxTOoB9XBWScIzsMYs5xaa-k21Fme9Sotb2Efe7Dt6_RfK1K9ey5WisciHR7W4GvHjF8Ct-gPvbA&user=NsdvzOIAAAAJ ResearchGate https://www.researchgate.net/profile/Martina-Nasuelli-2 Twitter https://twitter.com/MNasuelli
Il giorno gio 13 giu 2024 alle ore 22:15 Benjamin Callahan < @.***> ha scritto:
I would be a bit concerned about the fraction of reads you are losing in the chimera removal step. I'm not familiar with these primers and library preparation -- is it true that the reverse primer is 26nts long and is sequenced, but that the forward primer is not sequenced? That is what it looks like based on your filterAndTrim(..., trimLeft=c(0,26)) choice.
In addition, why have you chosen mergePairs(..., returnRejects=TRUE)? This is really only intended for troubleshooting, not as part of a normal workflow. removeBimeraDenovo(..., allowOneOff=TRUE) is also not a recommended option anymore.
I am not an expert on CO1 metabarcoding, but I would start by BLAST-ing some of your top ASVs against a broad database like nt, and comparing the best hits to what you are getting from taxonomic assignment against MIDORI. Do they mostly match up? Or is there an obvious mismatch even among the abundant ASVs?
— Reply to this email directly, view it on GitHub https://github.com/benjjneb/dada2/issues/1971#issuecomment-2166686311, or unsubscribe https://github.com/notifications/unsubscribe-auth/BJEVCRCDKICUZHZTR3AO5CLZHH4UHAVCNFSM6AAAAABJHZL2QWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRWGY4DMMZRGE . You are receiving this because you authored the thread.Message ID: @.***>
The primers I used are meant to amplify the CO1 gene for Metazoan; most of the best matches I obtain from BLAST, in terms of query over and similarity, are Fungi. I am referring to 98% query cover and 90% similarity, which is very low. Some Metazoan are returned with 87-88% of similarity.
One possibility is that these primers are not performing as expected, and so you are getting lots of off-target amplification. That would be my first guess based on what you describe here. It may also be worth looking at the alignments for your top hits --- are the mismatches clustered to a specific part of the reads? If so, it could be some technical issue that is solvable. If not, it is consistent with the non-specific priming idea.
Hello Ben and team, I am having trouble with the taxa assignation using dada2 and the MIDORI reference database. I've sequenced a Metazoan COI library based on Leray et al. 2013 primers targeting an amplicon of 313 bp. Based on the quality profile plots (attached), I've set these filters and parameters to get the taxa assignation: out <- filterAndTrim(fnFs, filtFs, fnRs, filtRs, trimLeft = c(0,26), truncLen = c(240,180), truncQ = 5, maxN = 0, maxEE = c(2,2), rm.phix=TRUE, compress=TRUE, multithread=TRUE)
errF <- learnErrors(filtFs, multithread=TRUE) errR <- learnErrors(filtRs, multithread=TRUE)
mergers <- mergePairs(dadaFs, filtFs, dadaRs, filtRs, verbose=TRUE, minOverlap = 12, returnRejects = TRUE)
seqtabfilt.nochim <- removeBimeraDenovo(seqtabfilt, method="consensus", allowOneOff = TRUE, multithread=TRUE, verbose=TRUE)
taxa.MIDORICOI <- assignTaxonomy(seqtabfilt.nochim, "../MIDORI2_UNIQ_NUC_GB259_CO1_DADA2.fasta.gz", tryRC = TRUE, taxLevels = c("Phylum","Class","Order","Family","Genus","Species"), minBoot = 50, multithread=TRUE)
After these steps, the taxa assignation I've got is mostly NA or no assignation lower than Order. In the attached files you can find the quality profile plots of the R1 and R2 fastqs and the taxa and count csv I've obtained. What I am missing here? Is there something I have done wrong in the library preparation or I am analyzing the data incorrectly? COI_analysis.zip
Thank you in advance,
Martina Nasuelli