ChiLiubio / microeco

An R package for data analysis in microbial community ecology
GNU General Public License v3.0

Question about plot_scatterfit #417

Open meredithds opened 1 month ago

meredithds commented 1 month ago

Hello,

I ran a Mantel test on centered environmental data and the Bray-Curtis distances of the microbial community data. My first question is whether the environmental data is used to create a Euclidean distance matrix for the Mantel test. I could not identify where in the code this takes place.

I then wanted to use plot_scatterfit to visualize the significant variables by treatment group. When the vector for the variable of interest is created using Euclidean distance and then correlated with the Bray-Curtis dissimilarities it creates an xy table. I have 40 samples, yet when this xy table is created, each treatment has 45 points. I was wondering why this might be.

Thank you! Meredith Snyder

ChiLiubio commented 1 month ago

Hi. Yes, the environmental data is used to create a Euclidean distance matrix for the Mantel test. For the second question, how many samples are there in each treatment? If there are 45 points per treatment, do you have 10 samples in each of 4 treatments? If you need me to help check the details, please provide more information or attach your data and steps (https://chiliubio.github.io/microeco_tutorial/notes.html#save-function). Thanks.

Chi
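
As a rough illustration of the first answer, here is a sketch in base R with simulated data. It is not microeco's internal code, and the object names (`ec`, `comm`, `bray`) and values are made up for the example. It shows how a single environmental variable becomes a Euclidean distance matrix with `dist()`, and how its entries pair up one-to-one with a Bray-Curtis matrix computed on the same samples. The Mantel statistic is the correlation between those two sets of pairwise distances.

```r
set.seed(1)
n <- 40
sample_names <- paste0("S", seq_len(n))

# Hypothetical environmental variable (standing in for something like EC)
ec <- setNames(rnorm(n), sample_names)

# Euclidean distance between every pair of samples for that one variable
ec_dist <- dist(ec, method = "euclidean")

# Simulated community table: 40 samples x 20 taxa (illustration only)
comm <- matrix(rpois(n * 20, lambda = 5), nrow = n,
               dimnames = list(sample_names, paste0("Taxon", 1:20)))

# Bray-Curtis by hand: sum|x_i - x_j| / (sum(x_i) + sum(x_j))
num <- as.matrix(dist(comm, method = "manhattan"))
den <- outer(rowSums(comm), rowSums(comm), "+")
bray <- as.dist(num / den)

# Both objects hold one value per sample pair, in the same order
length(ec_dist)
length(bray)

# Correlation between the two sets of pairwise distances
cor(as.vector(ec_dist), as.vector(bray), method = "spearman")
```

Each `dist` object stores only the lower triangle of the 40x40 matrix, so a plot of one distance against the other has one point per pair of samples rather than one point per sample.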

meredithds commented 1 month ago

Hi Chi,

Thank you for the quick response. Yes, I have 4 different biochar treatments with 10 samples each.

Both the dataset in the trans_env object and the beta diversity matrix in the microtable object have the correct number of samples and make a 40x40 matrix. The xy table results in 180 rows, 45 points per treatment. Wouldn't we expect 40 per treatment?

Here is my code:

# Reading in the environmental variables
Environment <- data.frame(read.table(file = 'biocharenv.csv', sep = ',', header = TRUE))
rownames(Environment) <- Environment$sample.name

# add_data is used to add the environmental data
chem <- trans_env$new(dataset = dataset, add_data = Environment)

ecall = chem$plot_scatterfit(
  x = "EC",
  y = dataset$beta_diversity$bray[rownames(chem$data_env), rownames(chem$data_env)],
  type = "cor",
  group = "treatment",
  cor_method = "spearman",
  point_size = 3,
  line_se = FALSE,
  line_size = 1.5,
  label.x.npc = "left",
  label.y.npc = "top",
  y_axis_title = "Bray-Curtis Distance",
  x_axis_title = "Euclidean Distance of EC") +
  theme(axis.title = element_text(size = 17)) +
  scale_color_manual(values = wes_palette(n = 4, name = "Moonrise1"))
ecall
ecall + theme(axis.title = element_text(size = 23),
  axis.text = element_text(size = 15),
  axis.line = element_line(),
  legend.title = element_text(size = 20),
  legend.text = element_text(size = 20),
  legend.background = element_blank()) +
  labs(color = 'Treatment')

Thanks, Meredith

ChiLiubio commented 1 month ago

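It should be fine. Each point is a pair of samples, not a single sample, and the 10 samples in a treatment form C(10, 2) = 10 × 9 / 2 = 45 pairs. With 4 treatments, that gives 4 × 45 = 180 rows.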
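
The count can be checked with a few lines of base R. This is a sketch with made-up sample and treatment labels, and it assumes the xy table keeps only pairs of samples from the same treatment, which is what the 180-row total suggests.

```r
n_per_group <- 10
treatments <- paste0("T", 1:4)          # hypothetical treatment labels

# Number of unordered pairs among 10 samples
choose(n_per_group, 2)

# Enumerate every pair of the 40 samples explicitly
group <- rep(treatments, each = n_per_group)
samples <- paste0("S", seq_along(group))  # hypothetical sample names
pairs <- t(combn(samples, 2))             # one row per unordered pair

# Look up the treatment of each member of a pair
g1 <- group[match(pairs[, 1], samples)]
g2 <- group[match(pairs[, 2], samples)]

# Keep pairs whose two samples share a treatment
within <- g1 == g2
table(g1[within])   # pairs per treatment
sum(within)         # total within-treatment pairs
nrow(pairs)         # all pairs among the 40 samples
```

Of the 780 possible pairs among 40 samples, 180 fall within a treatment, 45 in each of the 4 treatments.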