Closed rnettles04 closed 6 years ago
The problem is that seqtab2
is not a sequence table (a matrix in R) but instead is just a character vector of sequences.
What version of the package are you using? packageVersion("dada2")
And when you are using justConcatenate=TRUE
, are you using multiple samples, or just one sample?
The dada2 version I have is 1.4.0
When I use justConcatenate = TRUE I am running multiple samples. Do you have any suggestions to fix this?
Thanks!
It's not clear to me how this could be happening, so I would need to reproduce the error on my end. Could you share a .rds object of mergers2
? I.e. the file created by:
saveRDS(mergers2, "path/to/save/mergers2.rds")
It's also possible that upgrading to the current release (1.6.0) might solve the issue.
I have updated my R and the dada2 package. It fixed the problem. Thanks!
I have a few other questions that I am hoping you can help me with. In the assignTaxonomy function in dada2, how is it assigning taxonomy? I understand it is using a kmer approach, but is there a similarity threshold involved (like the standard 97% similarity)? I believe the wording indicates that the kmer approach generates a consensus taxonomy. How does that work, and what is the similarity requirement for taxonomy assignment?
In addition, does dada2 have the ability to rarefy the resulting taxonomic sequence table?
assignTaxonmy
is just a reimplementation of the naive Bayesian classifier method described in this paper by Wang et al 2007: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1950982/
In addition, does dada2 have the ability to rarefy the resulting taxonomic sequence table?
dada2 doesn't perform rarefaction, but you can do that via other facilities in R, such as the phyloseq package (e.g. phyloseq::rarefy_even_depth
, as shown in this tutorial)
I have sequenced the fungal ITS2 region and am attempting to analyse it using dada2. Unfortunately, many of my reads do not overlap, so I would like to use justConcatenate = TRUE when running the mergePairs command. Because I ran into downstream errors with the concatenated data, I decided to see if I encountered any errors when I mergePairs() via overlap. The overlapping data worked, while the concatenated data continues to give me an error at the removeBimeraDenovo() step, by giving me the following error: "Error in isBimeraDenovoTable(unqs[[i]], ..., verbose = verbose) : Input must be a valid sequence table."
The sequence of steps I am using is:
When I run mergePairs() with concatenated versus overlapping methods I get outputs that seem to be formatted similarly. The only difference I see is sporadic spacing. However, when I run makeSequenceTable() with overlapping merging, it gives me differently formatted output than the concatenated data does. The overlapping output successfully runs through removeBimeraDenovo(), while the concatenated output gives me an error at that step.
Is there some way I can fix this formatting issue so I can use the concatenated data for downstream analysis?
I have included some of my code, and the relevant file formats for those interested. Thanks!
Format of the OVERLAPPING reads after merging and creating a sequencing table:
$R840 sequence 1TGGTTCTGGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAGTGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACATTGCGCCCCTTGGTATTCCATGGGGCATGCCTGTTCGAGCGTCATTTGTACCTTCAAGCTCTGCTTGGTGTTGGGTGTTTGTCTCGCCTCTGCGTGTAGACTCGCCTTAAAACAATTGGCAGCCGGCGTATTGATTTCGGAGCGCAGTACATCTCGCGCTTTGCACTCATAACGACGACGTCCAAAAGTNNNNNNNNNNAGCTCTGCTTGGTGTTGGGTGTTTGTCTCGCCTCTGCGTGTAGACTCGCCTTAAAACAATTGGCAGCCGGCGTATTGATTTCGGAGCGCAGTACATCTCGCGCTTTGCACTCATAACGACGACGTCCAAAAGTACATTTTTACACTCTTGACCTCGGATCAGGTAGGGATACCCGCTGA 2 T GGTTCTCGCATCGATGAAGAACGTAGCAAAGTGCGATAACTAGTGTGAATTGCATATTCAGTGAATCATCGAGTCTTTGAACGCAGCTTGCACTCTATGGTTTTTCTATAGAGTACGCCTGCTTCAGTATCATCACAAACCCACACATAACATTTGTTTATGTGGTGATGGGTCGCATCGCTGTTTTATTACAGTGAGCACCTAAAATGTGTGTGATTTTCTGTCTGGCTTGCTAGGCAGGAATATTACGCTGGTCTCAGGATCTTTTTTTTTGGTTNNNNNNNNNNTTTTTTTGGTTCGCCCAGGAAGTAAAGTACAAGAGTATAATCCAGTAACTTTCAAACTATGATCTGAAGTCAGGTGGGATTACCCGCTGA 3
ACTTTATTTATTAAGATAGCTAATCACTTAAGCATATCAATAAGCGG AGGCTTTAGACCAAGTCTCTGCTACCGTACCGTTTGTATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAANNNNNNNNNNAGCTCTGCTTGGTGTTGGGTGTTTGTCTCGCCTCTGCGTGTAGACTCGCCTTAAAACAATTGGCAGCCGGCGTATTGATTTCGGAGCGCAGTACATCTCGCGCTTTGCACTCATAACGACGACGTCCAAAAGTACATTTTTACACTCTTGACCTCGGATCAGGTAGGGATACCCGCTGA 4TGGTTCTCGCATCGATGAAGAACGTAGCAAAGTGCGATAACTAGTGTGAATTGCATATTCAGTGAATCATCGAGTCTTTGAACGCAGCTTGCACTCTATGGTTTTTCTATAGAGTACGCCTGCTTCAGTATCATCACAAACCCACACATAACATTTGTTTATGTGGTGATGGGTCGCATCGCTGTTTTATTACAGTGAGCACCTAAAATGTGTGTGATTTTCTGTCTGGCTTGCTAGGCAGGAATATTACGCTGGTCTCAGGATCTTTTTTTTTGGTTNNNNNNNNNNAGCTCTGCTTGGTGTTGGGTGTTTGTCTCGCCTCTGCGTGTAGACTCGCCTTAAAACAATTGGCAGCCGGCGTATTGATTTCGGAGCGCAGTACATCTCGCGCTTTGCACTCATAACGACGACGTCCAAAAGTACATTTTTACACTCTTGACCTCGGATCAGGTAGGGATACCCGCTGA abundance forward reverse nmatch nmismatch nindel prefer accept 1 128 1 1 0 0 0 NA TRUE 2 29 2 2 0 0 0 NA TRUE 3 17 3 1 0 0 0 NA TRUE 4 1 2 1 0 0 0 NA TRUE
TGGTTCTCGCATCGATGAAGAGCGCAGCGAAATGCGATACCTAGTGTGAATTGCAGCCATCGTGAATCATCGAGTTCTTGAACGCACATTGCGCCCTTCGGTATTCCGGAGGGCATGCCTGTTTGAGCGTCGTTTCCTTCTTGCGCCAGCGCAGAGTTGGAGGGGTGTAGTGCCCTTCTGAAAAGAAACGTGCGGGTGAAGCGAACTATATTGGGACGCTTGGCCGCCGAACTTTTAATAAGCTCGACCTCAAATCAGGTAGGAATACCCGCTGAACT Plate1Neg
0 TGGCTCTCGCATCGATGAAGAACGCAGCGAAATGTGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCATCTTGCACTCTCTGGTATTCCGGAGAGTATGTCTGTTTGAGTGTCATGAATTCTTCAACCCTCCCTTTTCTTAGTGAATTGAGAGGAGTTTGGATTCTGAGTGTTGCTCCTAACCCGAGCTCATTCGTAATATATTAGCATCCATATTCGACTTCGGATTGACTTGGCGTAATAGACTATTCGCTGAGGAATCTAACTTCGGTTAGAGCCGTTTTTGAACTAGGAAGCTGCTAATCTAGCTTAGTCTACCTTTAGATTAGATCTCAAATCAGATAGGATTACCCGCTGA Plate1Neg
0
TGGTTCTGGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGCGAATTGCAGAATTTCCGTGAGTCATCGAATCTTTGAACGCACATTGCGCCCACTGGTATTCCGGTGGGCATGCCTGTTCGAGCGTCATTATCCTCCCTCAAACCCCGGGTTTGGTGTTGGACCGAAGTTGTGTGAACAACTGGTCTAAAAGACAATGACGGCGTCCGTGGGACCCTCGGTGCAACGAGCTTTTAGGAGCACGCGCCGAGTTGCAAGGACCTTCCGGGCCGGTCTCCTTTTACTATTTTACAAGGTTGACCTCGGATCAGGTAGGAATACCCGCTGA Plate1Neg
0
Format of the CONCATENATED reads after merging and creating a sequencing table:
$R836 sequence 1 TGGTTCTGGCATCGATGAAGAACGCAGCGAAATGCGAT AAGTAGTGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACATTGCGCCCCTTGGTATTCCATGGGGCATGCCTGTTCGAGCGTCATTTGTACCTTCAAGCTCTGCTTGGTGTTGGGTGTTTGTCTCGCCTCTGCGTGTAGACTCGCCTTAAAACAATTGGCAGCCGGCGTATTGATTTCGGAGCGCAGTACATCTCGCGCTTTGCACTCATAACGACGACGTCCAAAAGTACATTTTTACACTCTTGACCTCGGATCAGGTAGGGATACCCGCTGA 4 TGGTTCTGGCATCGATGAAGAAC GCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACATTGCGCCCCGTGGTATTCCTCGGGGCATGCCTGTTCGAGCGTCATTTCAACCCATCAAGCTTCGGCTTGGTCTTGGGGCCTGCGATTCCGCAGCCTCTAAACCCAGTGGCGGTGCCCTCGAGCTCTGAGCGTAGTAATTCTTCTCGCTATAGGGTCTCGGTGGTATCTTGCCAATAACCCCCAATTTTTATAAGGTTGACCTCGGATCAGGTAGGGATACCCGCTGA 5 TGGTTCTCGCATCGATGAAGAACGTAGCAAAGTGCGATAACTAGTGTGAATTGCATA TTCAGTGAATCATCGAGTCTTTGAACGCAACTTGCGCTCATTGGTATTCCAATGAGCACGCCTGTTTCAGTATCAAAACAAACCCTCTATTCAACTTTTGTTGTATAGGATTATTGGGGGCCTCTCGATCTGTATAGATCTTGAAACCCTTGAAATTTACTAAGGCCTGAACTTGTTTAAATGCCTGAACTTTTTTTTAATATAAAGGAAAGCTCTTGTAATTGACTTTGATGGGGCCTCCCAAATAAATCTCTTTTAAATTTGATCTGAAATCAGGCGGGATTACCCGCTGA 7 TGGCTCTGGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCAT CGAATCTTTGAACGCACATTGCGCCCTCTGGTATTCCGGAGGGCATGCCTGTTTGAGTGTCATTAATTACTCACCAAACCTTTAAGTTTCAAACTTGTGGTGCGGTTGATGTTGGGGTCTGCAAGCTTTTATAACTTGCTTCCCTTAAAAGCATTGGTGCGGCCTGCTGCACGGTTTACACAACGTTCTAGGTTCATTCCAACTCGTTGCTTGACCGGGCAAACTTTTGTGCTGCACCTTAAACCCGTTTCTAGCTTTATGCTTTGAACCTCTTCAACTATTGACCTCAGATCAGGTAGGAATACGCGCTGA abundance forward reverse nmatch nmismatch nindel prefer accept 1 320 1 2 129 0 0 2 TRUE 4 91 2 1 128 0 0 2 TRUE 5 84 3 5 100 0 0 1 TRUE 7 30 8 7 97 0 0 2 TRUE
TGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCCTTTTGGTATTCCGAAAGGCATGCCTGTTTGAGTGTCATGAAATATCAACCCCTCCTGGTTTCTGATCAGTGTTGGGCTTGGACTTGGGTGTCTGCCAGTTCGCTGGCTCGCCTTAAAAGAGTTAGTGGTAATAACATCCACGGCTAAGACGTAATAAGTTTCGTCTGGTCAAGGTGTGACGANNNNNNNNNNATCGGTGGTTTTTTTGCCCTCGTTCCTCGTCGTGCGTGCCCCGTCGCCCGAACGCGCTCTTACGACCCTCACGCGTCGCCTCGGCGGCGCTCCCAACGCGACCCCAGGTCAGGCGGGACTACCCGCTGA
TGGTTCTCGCATCGATGAAGAACGTAGCAAAGTGCGATAACTAGTGTGAATTGCATATTCAGTGAATCATCGAGTCTTTGAACGCAACTTGCGCTCATTGGTATTCCAATGAGCACGCCTGTTTCAGTATCAAAACAAACCCTCTATCCAACTTTTGTTGTATAGGATTATTGGGGGCCTCTCGATCTGTATAGATCTTGAAATCCCTGAAATTTACTAAGGCCTGAACTTGTTTAAATGCCTGAACTTTTTTTTAATATAAAGGAAAGCTCTTGTAANNNNNNNNNNCTGCGTGCTTGAGGTATAATGCAAGTACGGTCGTTTTAGGTTTTACCAACTGCGGCTAATCTTTTTTATACTGAGCGTATTGGAACGTTATCGATAAGAAGAGAGCGTCTAGGCGAACAATGTTCTTAAAGTTTGACCTCAAATCAGGTAGGAGTACCCGCTGA
TGGTTCTGGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAGTGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACATTGCGCCCATTGGTATTCCAATGGGCATGCCTGTTCGAGCGTCATTTGTACCCTCAAGCTTTGCTTGGTGTTGGGCGTTTGTCTTTCGAGACTCGCCTTAAAAAGATTGGCAGCCGGCATACTGGTTTCGGAGCGCAGCACAAATTGCGGTCTGTTCCTGAATGTTGACGTCCATGAAGCCCCCATTATNNNNNNNNNNCTGCGTGCTTGAGGTATAATGCAAGTACGGTCGTTTTAGGTTTTACCAACTGCGGCTAATCTTTTTTATACTGAGCGTATTGGAACGTTATCGATAAGAAGAGAGCGTCTAGGCGAACAATGTTCTTAAAGTTTGACCTCAAATCAGGTAGGAGTACCCGCTGA
Error in isBimeraDenovoTable(unqs[[i]], ..., verbose = verbose) : Input must be a valid sequence table.