Arcadia-Science / sourmashconsumr

Working with the outputs of sourmash in R
https://arcadia-science.github.io/sourmashconsumr/

Alpha diversity estimation #70

Open gabridinosauro opened 1 year ago

gabridinosauro commented 1 year ago

Hello and thanks for the awesome tool.

I have a question. I see you introduced an efficient method to plot and represent beta diversity between samples (dissimilarities).

I was wondering, what is the best way to represent alpha diversity? Is it just the number of taxa detected by sourmash taxonomy, the total number of sketches, or the slope like in the tutorial? What is the most correct way to represent the richness of a community? I think people would still love to see the total number of species detected. But maybe a rarefaction curve with k-mers should be reported too, to support the result?

Thanks, and sorry if the question is basic; I am still a noob in metagenomics.

taylorreiter commented 1 year ago

Hi @mrgambero, this is a great question! I would recommend using a statistical method dedicated to alpha diversity estimation in microbiomes, something like the package breakaway (https://github.com/adw96/breakaway). The notebook below predates sourmashconsumr, so it does some funky formatting things, but it demonstrates how to go from sourmash gather results to alpha diversity: https://github.com/taylorreiter/2022-sra-gather/blob/main/notebooks/20220321_species_richness_estimation.ipynb
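
To make the idea concrete, here is a minimal sketch of the kind of input a richness estimator works from. It is not the method used in the linked notebook and it does not call breakaway. It assumes you already have one abundance count per species for a sample (the `species_counts` values are made up, and how to derive such counts from sourmash gather output is left to the notebook). It builds a frequency count table (how many species were seen once, twice, and so on) and, purely as a stand-in illustration, computes the classic bias-corrected Chao1 estimate from it.

```python
from collections import Counter

# Hypothetical per-species abundance counts for one sample (made-up values).
# Deriving counts like these from sourmash gather output is an assumption here.
species_counts = {
    "species_a": 1,
    "species_b": 1,
    "species_c": 2,
    "species_d": 5,
    "species_e": 12,
}


def frequency_count_table(counts):
    """Return sorted (j, f_j) pairs, where f_j is the number of species seen exactly j times."""
    freq = Counter(c for c in counts.values() if c > 0)
    return sorted(freq.items())


def observed_richness(counts):
    """Number of species with a non-zero count."""
    return sum(1 for c in counts.values() if c > 0)


def chao1_bias_corrected(counts):
    """Bias-corrected Chao1: S_obs + f1 * (f1 - 1) / (2 * (f2 + 1))."""
    table = dict(frequency_count_table(counts))
    f1 = table.get(1, 0)  # singletons
    f2 = table.get(2, 0)  # doubletons
    return observed_richness(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


table = frequency_count_table(species_counts)
richness = observed_richness(species_counts)
chao1 = chao1_bias_corrected(species_counts)
print(table, richness, chao1)
```

The observed richness is the "total number of species detected" figure from the question. An estimator such as Chao1 uses the rare species (singletons and doubletons) to guess how many species were missed, which is why a dedicated statistical method is recommended over the raw count alone.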