ChiLiubio / microeco

An R package for data analysis in microbial community ecology
GNU General Public License v3.0
204 stars 59 forks source link

warning from ggplot, NMDS, shape in ordination. #251

Open marwa38 opened 1 year ago

marwa38 commented 1 year ago

Just reporting a warning out of ggplot

image That might be because the trans_beta$new is grouped by Region while the plot_shape is grouped by Regime?

Thanks Mara

marwa38 commented 1 year ago

image is it NMDS or MDS? I used NMDS in ordination but the plot shows it is MDS!

image

marwa38 commented 1 year ago

Not sure if you could possibly add control over the shape we could choose? triangle as default for the second group is not the best for visualization as it is a bit similar to circles as both are filled. filled circles and + is a suggestion to be used together or we control what shape we want. image

ChiLiubio commented 1 year ago

Hi. Please use shape_values parameter in plot_ordination function. It is easier to add shape_values = c(16, 3).

ChiLiubio commented 1 year ago

In the axis, MDS has the same meaning with NMDS. If you want to use 'NMDS', please try to use xlab function of ggplot2 to change it.

marwa38 commented 1 year ago

Thanks so much

marwa38 commented 1 year ago

MANOVA is a parametric test, unlike PERMANOVA (adonis.pair() for pairwise). I got some axes that are normally distributed that I beleive will run MANOVA on. ANOSIM is also one of the statistical methods used in microbiota research. image from: https://bioinformatics.ccr.cancer.gov/docs/qiime2/Lesson6/

Thanks M

marwa38 commented 1 year ago

image That might be because the trans_beta$new is grouped by Region while the plot_shape is grouped by Regime?

ChiLiubio commented 1 year ago

Thanks. The anosim will be implemented in the trans_beta class. The manova here equals to perMANOVA. Please also see the document with ?trans_beta

ChiLiubio commented 1 year ago

The warning message comes from the shape aesthetics. Both color and shape are used, the ggplot2 package can only use the groups of plot_color, so it drops the groups of plot_shape automatically. In this case, the function can not add ellipse for each possible combination of groups between plot_color and plot_shape.

ChiLiubio commented 1 year ago

Hi. In the updated github version, ANOSIM has been implemented in the new created cal_anosim function of trans_beta class. If you need it, please update your package from github and try again.

library(microeco)
t1 <- trans_beta$new(dataset = dataset, group = "Type", measure = "bray")
t1$cal_anosim()
t1$res_anosim

# group can also be provied in the function
t1 <- trans_beta$new(dataset = dataset,  measure = "bray")
t1$cal_anosim(group = "Group")
t1$res_anosim

# paired test between two groups
library(microeco)
t1 <- trans_beta$new(dataset = dataset, group = "Type", measure = "bray")
t1$cal_anosim(paired = TRUE)
head(t1$res_anosim)
marwa38 commented 1 year ago

Thanks. The anosim will be implemented in the trans_beta class. The manova here equals to perMANOVA. Please also see the document with ?trans_beta

PERMANOVA is non-parametric while MANOVA is parametric so they are different. Some researchers try to transform the data to be normally distributed so that they could use a powerful statistical test (i.e. parametric). I checked trans_beta() before and I know it is PERMANOVA, not MANOVA. Another point is that some researchers run the statistical analysis on a few axes after checking the scree plot (example 2 or 3 axes rather than running on all PCoA/MDS/PCA/NMDS axes).

example

# Extract eigenvalues
eigenvalues <- out.brayPCoA.intes.log$values$Eigenvalues

# Calculate cumulative proportion of variance explained
prop_var <- cumsum(eigenvalues) / sum(eigenvalues)

## Create scree plot
# Create an empty plot
plot(NULL, xlim = c(1, length(eigenvalues)), ylim = range(eigenvalues), xlab = "Component", ylab = "Eigenvalue", main = "Scree Plot")

# Add the scree plot
lines(1:length(eigenvalues), eigenvalues, type = "b")

# Identify the appropriate number of components to retain
threshold <- 1  # For Kaiser's Criterion
# threshold <- 0.1  # For the Elbow Rule

abline(h = threshold, col = "red")  # Add a horizontal line for the threshold

# Identify the component number where the eigenvalues cross the threshold
num_components <- which(eigenvalues < threshold)[1] - 1

# Print the identified number of components
cat("Number of components to retain:", num_components, "\n")
ChiLiubio commented 1 year ago

Thanks. I got it. I will consider optimizing this part to provide more options in the future version.