benjjneb / dada2

Accurate sample inference from amplicon data with single nucleotide resolution
http://benjjneb.github.io/dada2/
GNU Lesser General Public License v3.0
Comparing 454 and Illumina sequencing #875

Closed braddmg closed 5 years ago

braddmg commented 5 years ago

Hi everyone, I have samples from many years ago that were sequenced with 454 sequencing (150 bp reads) and more recent samples that were sequenced with Illumina (250 bp reads). I need to compare these samples, but I am unsure how I should analyze the data. Is it possible to compare samples from different sequencing technologies? Thank you very much.

benjjneb commented 5 years ago

Did you use the same primers? Are the sequenced regions overlapping?

If so, you can cut down to the overlap region and merge at the ASV level. If not, you'll have to do taxonomic assignment on each dataset and compare at that level.
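As a rough illustration of the "cut down to the overlap region" idea, here is a minimal Python sketch rather than dada2 code. It assumes that primers have already been removed and that both read sets start at the same position, so the overlap is simply the first `overlap` bases of each ASV. The function name `truncate_table`, the sample names, and the toy sequences are all hypothetical.

```python
from collections import Counter


def truncate_table(table, length):
    """Trim every ASV to its first `length` bases.

    `table` maps sample name -> {ASV sequence: count}. ASVs that become
    identical after trimming have their counts summed; ASVs shorter than
    `length` are dropped.
    """
    out = {}
    for sample, asv_counts in table.items():
        collapsed = Counter()
        for seq, count in asv_counts.items():
            if len(seq) >= length:
                collapsed[seq[:length]] += count
        out[sample] = dict(collapsed)
    return out


# Toy data (hypothetical): short 454 ASVs and longer Illumina ASVs.
reads_454 = {"old_1": {"ACGTAC": 10, "ACGTTT": 4}}
reads_illumina = {"new_1": {"ACGTACGGA": 7, "ACGTACGTT": 3, "TTGTACGGA": 5}}

overlap = 6  # assumed length of the shared region, in bases

# Sample names are assumed to be unique across the two datasets.
merged = {
    **truncate_table(reads_454, overlap),
    **truncate_table(reads_illumina, overlap),
}

# ASVs that are now directly comparable between the two samples.
shared = set(merged["old_1"]) & set(merged["new_1"])
print(merged)
print(shared)
```

After trimming, sequences from the two technologies can be matched exactly, which is what makes a merge at the ASV level possible.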

braddmg commented 5 years ago

Hi, we used the same primers, but I really don't know how to cut down to the overlap region. I think the best option for me is to run dada2 and do the taxonomic assignment on each dataset, then compare at that level. If I do that, I am going to have different taxonomy tables. I suppose I will need to combine those taxonomy tables?

benjjneb commented 5 years ago

Yeah that probably makes the most sense, or is at least the most straightforward.

You will need to combine the taxonomy tables. One way would be to import both into phyloseq and then use the merge_phyloseq function. However, you can also do it yourself fairly easily if you are reasonably familiar with R.

braddmg commented 5 years ago

Hi again, I did what we talked about yesterday, but when I tried to merge the two phyloseq objects I got this error: "Error in FUN(X[[i]], ...) : one tree has a different number of tips". I guess that is because I need to create the tree after merging the OTU and taxonomy tables, but I really don't know how to do that without a phyloseq object. Is it possible to create the tree from the phyloseq object? Thank you.

braddmg commented 5 years ago

I was thinking a little more about this interesting topic. I think that, because of the particular clustering that dada2 uses, it is impossible to find the same ASV in the 454 samples and the Illumina samples. This is going to have a big impact on beta diversity, because if my two datasets don't share any ASVs, the analysis will indicate that those samples are very different from each other. That might only be different if I use UniFrac, because that index considers the phylogenetic relationships. I don't know if that would really be good. On the other hand, for alpha diversity and for exploratory analysis looking for specific phyla, families, or even genera, that should not be a problem, because we assume that the taxonomic assignment is OK. What do you think about that? I am really new to this topic and I really appreciate the help of an expert in this area like you. I think the better option here is to do another sampling (if that is possible, I don't know) and sequence with the same method.
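To make the beta diversity concern concrete, here is a small Python sketch with made-up numbers. It computes Bray-Curtis dissimilarity between one 454 sample and one Illumina sample, first on ASV counts that share no ASV, then on the same counts summed by genus. The ASV labels, genus names, and counts are all hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two {feature: count} dicts."""
    features = set(a) | set(b)
    shared = sum(min(a.get(f, 0), b.get(f, 0)) for f in features)
    total = sum(a.values()) + sum(b.values())
    return 1 - 2 * shared / total


# ASV level: the two technologies yield disjoint sets of ASVs (hypothetical).
old_asv = {"asv_454_a": 60, "asv_454_b": 40}
new_asv = {"asv_ilm_a": 50, "asv_ilm_b": 30, "asv_ilm_c": 20}

# The same counts after summing ASVs by their assigned genus (hypothetical).
old_genus = {"Bacteroides": 60, "Prevotella": 40}
new_genus = {"Bacteroides": 50, "Prevotella": 30, "Roseburia": 20}

asv_level = bray_curtis(old_asv, new_asv)
genus_level = bray_curtis(old_genus, new_genus)
print(asv_level, genus_level)
```

With no shared ASVs the dissimilarity is at its maximum of 1.0 regardless of how similar the communities really are, while the genus-level value reflects the shared taxa.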

benjjneb commented 5 years ago

> Hi again, I did what we talked about yesterday, but when I tried to merge the two phyloseq objects I got this error: "Error in FUN(X[[i]], ...) : one tree has a different number of tips". I guess that is because I need to create the tree after merging the OTU and taxonomy tables, but I really don't know how to do that without a phyloseq object. Is it possible to create the tree from the phyloseq object? Thank you.

Yes, you'd want to make the tree on the merged dataset. However, if you are merging based on taxonomy, it's not obvious that you will get a very effective tree via the typical alignment/phylogeny route. You may just want to use the taxonomic tree.

benjjneb commented 5 years ago

> I think the better option here is to do another sampling (if that is possible, I don't know) and sequence with the same method.

This is a good option that would also help to reduce methodological batch effects between the studies. But it is possible to merge datasets even if they used different sequencing technologies: assign taxonomy based on a common reference, collapse the data into those named taxa, and then merge the "taxa tables" (not the ASV or OTU tables).
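The collapse-and-merge approach can be sketched as follows. This is a plain Python illustration under stated assumptions, not the dada2 or phyloseq API: taxonomy has already been assigned to every ASV against the same reference, each dataset is a `{sample: {ASV: count}}` table with an `{ASV: genus}` lookup, and sample names do not collide. The helper names `collapse_to_taxa` and `merge_taxa_tables` and the toy data are hypothetical.

```python
def collapse_to_taxa(counts, taxonomy, unassigned="Unassigned"):
    """Sum ASV counts per assigned taxon within each sample.

    counts:   {sample: {asv: count}}
    taxonomy: {asv: taxon name, or None if no assignment was made}
    """
    out = {}
    for sample, asv_counts in counts.items():
        row = {}
        for asv, n in asv_counts.items():
            taxon = taxonomy.get(asv) or unassigned
            row[taxon] = row.get(taxon, 0) + n
        out[sample] = row
    return out


def merge_taxa_tables(*tables):
    """Combine taxa tables into one table over the union of taxa,
    filling taxa that are absent from a sample with zero."""
    taxa = sorted({t for table in tables for row in table.values() for t in row})
    merged = {}
    for table in tables:
        for sample, row in table.items():
            if sample in merged:
                raise ValueError(f"duplicate sample name: {sample}")
            merged[sample] = {t: row.get(t, 0) for t in taxa}
    return taxa, merged


# Toy data (hypothetical): one sample per sequencing technology.
counts_454 = {"old_1": {"a1": 12, "a2": 8}}
tax_454 = {"a1": "Bacteroides", "a2": "Bacteroides"}

counts_ilm = {"new_1": {"b1": 5, "b2": 9, "b3": 2}}
tax_ilm = {"b1": "Bacteroides", "b2": "Prevotella", "b3": None}

taxa, merged = merge_taxa_tables(
    collapse_to_taxa(counts_454, tax_454),
    collapse_to_taxa(counts_ilm, tax_ilm),
)
print(taxa)
print(merged)
```

In phyloseq, the collapsing step corresponds roughly to `tax_glom` at a chosen rank. How to treat unassigned ASVs (kept here as a single "Unassigned" bin) is a design choice that depends on the analysis.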