clemente-lab / mmeds-meta

A database for storing and analyzing omics data
https://mmeds.org

Improved summaries: include text descriptions #325

Closed cleme closed 2 years ago

cleme commented 3 years ago

Is your feature request related to a problem? Please describe.

Current summaries present results with no description of what each method represents. Include some standard language before each section describing the method being applied.

Describe the solution you'd like

"Alpha diversity estimates the amount of microbial diversity within a sample or group..."

"Beta diversity estimates the similarity between different samples based on microbial composition...PCoA then..."

@cleme to provide specific text for each section. Relatively easy fix: it only requires adding text to the summaries.

Linked to issue #322

cleme commented 3 years ago

Alpha diversity and rarefaction curves

Alpha diversity estimates the amount of microbial diversity present in a sample or group of samples. Several measures can be used for alpha diversity, including observed features, Shannon's diversity, and Faith's phylogenetic diversity. Because diversity estimates depend on the total number of sequences assigned to each sample, rarefaction curves are constructed to show the relation between alpha diversity (on the vertical axis) and sequencing depth (on the horizontal axis). Curves that gradually plateau as sequencing depth increases suggest that additional sequencing effort would reveal little diversity that has not already been observed; curves that continue to increase suggest that additional sequencing effort might be required to saturate the estimate.
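
For anyone wiring this text into the summaries, a minimal sketch of what the paragraph describes may help. It is not the code the pipeline runs: the function names, the toy counts, the number of iterations, and the choice of base-2 logarithms for Shannon's index are all assumptions made for illustration, and Faith's phylogenetic diversity is left out because it needs a tree.

```python
import math
import random
from collections import Counter


def observed_features(counts):
    """Number of features (e.g. ASVs) with at least one sequence."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon's diversity index; log base 2 is an assumption of this sketch."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def rarefy(counts, depth, rng):
    """Subsample `depth` sequences without replacement from one sample."""
    pool = [i for i, c in enumerate(counts) for _ in range(c)]
    picked = Counter(rng.sample(pool, depth))
    return [picked.get(i, 0) for i in range(len(counts))]


def rarefaction_curve(counts, depths, metric, iterations=10, seed=0):
    """Average `metric` over repeated subsamples at each sequencing depth."""
    rng = random.Random(seed)
    curve = []
    for depth in depths:
        values = [metric(rarefy(counts, depth, rng)) for _ in range(iterations)]
        curve.append((depth, sum(values) / iterations))
    return curve


# Hypothetical sample: 7 features, 100 sequences in total.
sample = [50, 30, 10, 5, 3, 1, 1]
curve = rarefaction_curve(sample, [10, 25, 50, 100], observed_features)
for depth, value in curve:
    print(depth, value)
```

Each `(depth, value)` pair is one point on the curve: depth on the horizontal axis, the averaged alpha diversity on the vertical axis. At the full depth of 100 the subsample is the whole sample, so the last point equals the observed feature count of the unrarefied data.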

cleme commented 3 years ago

Beta diversity and PCoA plots

Beta diversity estimates how similar or dissimilar samples are based on their microbiome composition. Unlike alpha diversity, which is estimated per sample, beta diversity is a distance calculated between pairs of samples. Samples that are similar to each other in their microbiome composition will have a small distance between them based on beta diversity, while those that are very different in their composition will have a large distance. Principal Coordinate Analysis (PCoA) is an ordination technique that visually represents the samples based on their beta diversity distances to facilitate the identification of clusters or gradients of samples. By default, the first three principal coordinates are shown in PCoA plots.
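
To make the "distance between pairs of samples" idea concrete, here is a small sketch that builds a pairwise distance matrix. Bray-Curtis is used only as a common example of a beta diversity metric; the issue does not say which metric the summaries use, and the sample names and counts below are made up. The ordination step is not shown: PCoA would take a matrix like this one as its input.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count vectors (0 = identical)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(u) + sum(v)
    return numerator / denominator if denominator else 0.0


def distance_matrix(samples):
    """Pairwise distances for a dict of sample id -> count vector."""
    ids = sorted(samples)
    matrix = [[bray_curtis(samples[a], samples[b]) for b in ids] for a in ids]
    return ids, matrix


# Hypothetical feature counts: two similar samples and one very different one.
samples = {
    "gut_1": [40, 30, 20, 10, 0],
    "gut_2": [35, 35, 20, 10, 0],
    "skin_1": [0, 5, 10, 25, 60],
}

ids, dm = distance_matrix(samples)
for sample_id, row in zip(ids, dm):
    print(sample_id, [round(d, 2) for d in row])
```

The two gut samples end up close to each other (0.05), while both are far from the skin sample (0.75), which is the pattern a PCoA plot would then show as a cluster and an outlying point.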

cleme commented 3 years ago

Taxonomy plots

Taxonomy plots represent the abundance of different taxa using stacked plots on a per-sample or per-group (averaged) basis. Data is normalized so that abundances per sample or per group add up to 100%. When using group-based taxonomy plots, it should be noted that only average abundances are shown per group and taxon: this can introduce visual biases when a small number of samples in a group have a substantially higher abundance of a given taxon than the rest of the samples in the group, giving the (incorrect) impression that the group as a whole has a high abundance of that taxon.
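
The averaging caveat is easier to see with numbers. The sketch below normalizes hypothetical per-sample counts to percentages, averages them per group, and compares the group mean with the median. The taxa, counts, and helper names are invented for illustration, and the median is shown only as a contrast; nothing here implies the summaries report it.

```python
from statistics import mean, median


def to_percent(counts):
    """Normalize one sample's taxon counts so they add up to 100%."""
    total = sum(counts.values())
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


def group_mean(samples):
    """Average relative abundance per taxon across the samples in a group."""
    taxa = sorted({taxon for sample in samples for taxon in sample})
    return {t: mean(s.get(t, 0.0) for s in samples) for t in taxa}


# Hypothetical group: three similar samples and one Prevotella-dominated outlier.
group = [
    {"Bacteroides": 90, "Prevotella": 10},
    {"Bacteroides": 95, "Prevotella": 5},
    {"Bacteroides": 92, "Prevotella": 8},
    {"Bacteroides": 10, "Prevotella": 90},
]

percent = [to_percent(sample) for sample in group]
averaged = group_mean(percent)
prevotella = [p["Prevotella"] for p in percent]

print(averaged)            # what a group-level stacked plot would display
print(median(prevotella))  # typical per-sample value, for comparison
```

In this toy group the averaged plot would show Prevotella at 28.25%, although three of the four samples have 10% or less (median 9%). The single outlier drives the group bar, which is the visual bias the text warns about.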

adamcantor22 commented 3 years ago

Do you want to also have descriptions for the other new sections, 1. Demultiplex and 2. Table statistics?

cleme commented 3 years ago

I cannot think of anything right now beyond the title, so let's keep it simple.