joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Extremely wide variation of alpha diversity estimates among different studies #1747

Open migfejo opened 5 months ago

migfejo commented 5 months ago

Hi.

I have been working on gut microbiome data in several projects. I use dada2 for data processing and then phyloseq to format the data and compute some estimates (such as alpha and beta diversity).

I apply the estimate_richness function to the original phyloseq object, i.e., before aggregating abundances at any taxonomic level or transforming them into compositional data. The problem is that, unexpectedly, my Chao1 estimates vary a lot among the projects I am working on. For example, in one project of 55 samples I obtain a mean Chao1 of 2228 (± 579 SE), while in another project of 21 samples I obtain a mean Chao1 of 8264 (± 1849 SE). The two projects are very similar, both consisting of gut microbiota. Moreover, I have read about other projects with Chao1 values between roughly 300 and 500. These differences seem too large to me, and I don't know what the reason may be.
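For reference, here is a minimal sketch of how a Chao1 estimate depends on the counts in a single sample. It assumes the bias-corrected form of Chao1; whether this matches the exact variant used by estimate_richness is an assumption, and the function name `chao1` and both toy samples are hypothetical. It only illustrates that the estimate is driven by the number of singletons and doubletons, not by total richness alone.

```python
from collections import Counter

def chao1(counts):
    """Bias-corrected Chao1 for one sample's per-taxon read counts (illustrative only)."""
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)          # number of taxa actually observed
    freq = Counter(observed)
    f1 = freq[1]                   # singletons: taxa seen exactly once
    f2 = freq[2]                   # doubletons: taxa seen exactly twice
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# Hypothetical toy samples, not real data
sample_a = [50, 30, 10, 5, 2, 2, 1, 1]          # 8 taxa, 2 singletons, 2 doubletons
sample_b = [50, 30, 10, 5, 1, 1, 1, 1, 1, 1]    # 10 taxa, 6 singletons, no doubletons

print(chao1(sample_a))
print(chao1(sample_b))
```

In this toy case the second sample has only two more observed taxa, yet its estimate is about three times higher because of the extra singletons.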

Furthermore, the Shannon indices I obtained in these projects are around 7 to 8. According to the literature, these values seem too high for gut microbiome communities, which usually fall around 4 or 5. My data come from cystic fibrosis patients and elderly people, and these populations should show even lower Shannon values. I tried calculating the Shannon index after aggregating abundances at the species level, but then the estimates come out at around 2, which is too low.
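As a point of comparison, a minimal sketch of the Shannon index on toy counts, assuming the natural logarithm is used. The function name `shannon` and the counts are hypothetical. It shows two things: merging taxa can only keep the index the same or lower it, and the index can never exceed the log of the number of taxa, so a value of 7 requires more than a thousand taxa in a sample.

```python
import math

def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical toy data: the same 100 reads at two levels of aggregation
asv_counts = [25, 25, 25, 25]   # four sequence variants
species_counts = [50, 50]       # the same reads merged pairwise into two species

print(shannon(asv_counts))      # equals ln(4) for four equally abundant taxa
print(shannon(species_counts))  # equals ln(2) after merging

# H is at most ln(S), so H = 7 needs at least exp(7) taxa in one sample
min_taxa_for_h7 = math.ceil(math.exp(7))
print(min_taxa_for_h7)
```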

Does anybody know what may be happening here?