Closed skelto3 closed 4 years ago
To clarify, the sequences linked here are the mock community sequences you are trying to recover? And, is it just these sequences being denoised, or are they part of a longer sequenced region?
Also, do "first" and "second" in your text correspond to the 1st and 2nd sequences in Pmb.F.priors?
Yes, the sequences listed are those that I am trying to recover, and they should be the only sequences present in the samples. These sequences are the complete amplicon (after removing primers); they are not part of a longer sequenced region. I promise there are good reasons, which I realize are not obvious, for metabarcoding such a tiny region. Yes, first and second correspond to the order in the Pmb.F.priors vector.
-- James Skelton Community Ecologist
webpage: poetsworm.com
email: skelto3@g skelto3@vt.edumail.com
That is... strange. When I use the dada2 alignment from within the R package, these sequences are all clearly distinguished from one another, so what is going on?
unname(outer(Pmb.F.priors, Pmb.F.priors, nwhamming, vec=TRUE))
What version of the dada2 R package are you using? Could you share an example fastq file with me?
Using v1.14.0. I would be willing to share a fastq file privately. How may I do so?
You can email me: benjamin DOT j DOT callahan AT gmail DOT com
Did we get this figured out over email?
Yes. Changing gap_penalty to 20 resolved the issue. Thank you for checking back.
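A possible explanation for why the gap penalty matters here, offered as a sketch rather than a confirmed diagnosis: the first two mock sequences differ by three substitutions, but they can also be aligned with one gap in each sequence and no mismatches at all. The base-R code below scores both options with a plain global Needleman-Wunsch recursion. It assumes match = 5, mismatch = -4 and gap = -8 (which I believe are the DADA2 defaults; check getDadaOpt() to confirm), reads the "20" above as a penalty of -20, and ignores DADA2's banding, ends-free gaps and homopolymer gap option. nw_score and ungapped_score are hypothetical helpers written for this illustration, not dada2 functions.

```r
# First two mock community sequences, copied from Pmb.F.priors in this thread.
s1 <- "AGCTATTCTATTCCTAAATAATACATCCAACACTCCAACACTATTATTCCTAGCAACC"
s2 <- "AGCTATTCTATTCCTAAATAATACTCTCAACACTCCAACACTATTATTCCTAGCAACC"

# Hypothetical helper: best global alignment score (end gaps are penalized).
# The default scores are ASSUMED to mirror DADA2's MATCH/MISMATCH/GAP_PENALTY.
nw_score <- function(s1, s2, match = 5, mismatch = -4, gap = -8) {
  a <- strsplit(s1, "")[[1]]
  b <- strsplit(s2, "")[[1]]
  n <- length(a)
  m <- length(b)
  # score[i + 1, j + 1] is the best score for aligning a[1..i] with b[1..j].
  score <- matrix(0, n + 1, m + 1)
  score[, 1] <- gap * (0:n)
  score[1, ] <- gap * (0:m)
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      sub <- if (a[i] == b[j]) match else mismatch
      score[i + 1, j + 1] <- max(score[i, j] + sub,      # match or mismatch
                                 score[i, j + 1] + gap,  # gap in s2
                                 score[i + 1, j] + gap)  # gap in s1
    }
  }
  score[n + 1, m + 1]
}

# Hypothetical helper: score of the gap-free, position-by-position alignment.
ungapped_score <- function(s1, s2, match = 5, mismatch = -4) {
  a <- strsplit(s1, "")[[1]]
  b <- strsplit(s2, "")[[1]]
  sum(ifelse(a == b, match, mismatch))
}

no_gaps  <- ungapped_score(s1, s2)       # three mismatches, no gaps
best_g8  <- nw_score(s1, s2, gap = -8)   # assumed default gap penalty
best_g20 <- nw_score(s1, s2, gap = -20)  # stricter gap penalty

# If the best score beats the gap-free score, the best alignment uses gaps.
cat("gap-free:", no_gaps, " best at -8:", best_g8, " best at -20:", best_g20, "\n")
```

Under these assumptions the gapped alignment outscores the three-mismatch alignment at a gap penalty of -8 but not at -20, which would be consistent with the fix reported above. The thread does not show the exact call that was used; in dada2 the option is presumably set with something like setDadaOpt(GAP_PENALTY = -20).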
Hello,
I constructed simple mock communities composed of short synthetic genes that vary only at a 6 bp region in the middle, combined in various known concentrations (example sequences of one such mock community pasted below). In every try so far, at least one of the known mock community members is not recovered after denoising, despite an abundance of perfect matches being present in the raw reads. I have included the known sequences as priors (forward and reverse complements, prior to merging), tried both pooling and no pooling, and tried selfConsist = T and F, all to no avail. In the example mock community below, based on the known concentrations of the different variants going into the mock, it appears that the first and second sequences are being assigned to the same ASV, which is given the sequence of the second mock member, and thus I recover zero perfect matches for the first mock member. This is particularly puzzling because the first mock member comprises about a third of the raw reads in some samples.
Is there anything else I can try to get DADA2 to discriminate among these similar sequences?
Thank you.
Pmb.F.priors <- c("AGCTATTCTATTCCTAAATAATACATCCAACACTCCAACACTATTATTCCTAGCAACC", "AGCTATTCTATTCCTAAATAATACTCTCAACACTCCAACACTATTATTCCTAGCAACC", "AGCTATTCTATTCCTAAATAATAAGAGCAACACTCCAACACTATTATTCCTAGCAACC", "AGCTATTCTATTCCTAAATAATAATGACAACACTCCAACACTATTATTCCTAGCAACC", "AGCTATTCTATTCCTAAATAATATACACAACACTCCAACACTATTATTCCTAGCAACC")
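As a quick sanity check on how similar these mock members are, the following base-R sketch counts position-by-position differences between every pair of sequences. It does no alignment and does not use dada2; the names priors, bases, hamming and dist_mat are chosen here for illustration only.

```r
# Mock community sequences, copied from Pmb.F.priors above.
priors <- c("AGCTATTCTATTCCTAAATAATACATCCAACACTCCAACACTATTATTCCTAGCAACC",
            "AGCTATTCTATTCCTAAATAATACTCTCAACACTCCAACACTATTATTCCTAGCAACC",
            "AGCTATTCTATTCCTAAATAATAAGAGCAACACTCCAACACTATTATTCCTAGCAACC",
            "AGCTATTCTATTCCTAAATAATAATGACAACACTCCAACACTATTATTCCTAGCAACC",
            "AGCTATTCTATTCCTAAATAATATACACAACACTCCAACACTATTATTCCTAGCAACC")

# The comparison below is only meaningful if all sequences have equal length.
stopifnot(length(unique(nchar(priors))) == 1)

# Split each sequence into a vector of single bases.
bases <- strsplit(priors, "")

# Ungapped Hamming distance: number of positions where two sequences differ.
hamming <- function(a, b) sum(a != b)

# Symmetric matrix of pairwise distances; rows and columns follow vector order.
dist_mat <- sapply(bases, function(a) sapply(bases, function(b) hamming(a, b)))
print(dist_mat)
```

Without gaps, every pair of mock members differs at 3 or 4 positions, so any merging of two members has to come from how the reads are aligned or denoised rather than from the sequences being near-identical.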