Open OrsonMM opened 2 days ago
Dear Orson,
Thank you for your interest in MicrobiomeStat and for reaching out with your question about Linear Mixed Models (LMM). We appreciate your detailed description of your experimental design.
From your description, I can see you have 4 treatments, 3 time points, and 5 replicates per treatment-time combination.
While the model formula you suggested (y ~ time.var + group.var + time.var:group.var + (1|subject.var)) is generally appropriate for longitudinal microbiome data analysis, to better assist you, could you please specify which MicrobiomeStat function(s) you are using?
Each function might have slightly different implementations to accommodate the specific needs of alpha diversity, beta diversity, and differential abundance analyses.
Once you clarify which function(s) you're working with, I can provide more specific guidance about the model implementation.
Best regards
Hi Caffery Yang,
Thanks for the rapid response.
I understand from your response that each function builds a different model equation. I have the most doubts about these functions:
alpha_time_diversity <- generate_alpha_trend_test_long(
data.obj = rarefy_data_genus,
alpha.name = c("shannon", "simpson", "observed_species", "chao1", "ace","pielou"),
depth = NULL,
time.var = "Time",
subject.var = "sample_treatment_time",
group.var = "Treat",
adj.vars = NULL
)
beta_diversity <- generate_beta_trend_test_long(
data.obj = rarefy_data_genus,
dist.obj = NULL,
subject.var = "sample_treatment_time", # random effect - I do not understand whether this is a random slope or a random intercept
time.var = "Time", # Fixed effect
group.var = "Treat",
adj.vars = NULL,
dist.name = c("Jaccard")
)
beta_diversity_volatility <- generate_beta_volatility_test_long(
data.obj = rarefy_data_genus,
dist.obj = NULL,
subject.var = "sample_treatment_time",
time.var = "Time",
group.var = "Treat",
adj.vars = NULL,
dist.name = c("BC","Jaccard","UniFrac","JS")
)
3. Differential abundance (DA)
Here I preferred to use linda because I can specify the equation myself.
(But I am not sure whether it is correct.)
model_1 <- linda(
feature.dat = genus_normalizated_data$feature.tab,
meta.dat = genus_data$meta.dat,
formula = '~ Time + Treat + Treat:Time + (1 | sample_treatment_time)',
feature.dat.type = c('proportion'),
prev.filter = 0.1,
mean.abund.filter = 0,
max.abund.filter = 0,
is.winsor = TRUE,
outlier.pct = 0.03,
adaptive = TRUE,
zero.handling = c('imputation'),
pseudo.cnt = 0.5,
corr.cut = 0.1,
p.adj.method = "fdr",
alpha = 0.05,
n.cores = 20,
verbose = TRUE
)
Hi Orson,
Thank you for your detailed follow-up questions about the model equations in MicrobiomeStat. I'll explain how each function implements its statistical models:
For your alpha_time_diversity call, the function implements a linear mixed effects model of the form:
alpha_diversity ~ Treat * Time + (1 + Time | Sample_Time)
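This model includes fixed effects for Treat, Time, and their interaction, plus a random intercept and a random slope for Time for each subject.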
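Written out, the alpha diversity formula above corresponds to the following equation. This is only a sketch of the notation, under some assumptions that are not stated in the thread: Time is treated as numeric, y_{ij} is the alpha diversity of subject i at its j-th visit, t_{ij} is the time of that visit, g(i) is the treatment of subject i, and the reference treatment level has its \beta and \gamma terms fixed at zero (R's default treatment contrasts).

```latex
\[
y_{ij} = \beta_0 + \beta_{g(i)} + \beta_T\, t_{ij} + \gamma_{g(i)}\, t_{ij}
         + b_{0i} + b_{1i}\, t_{ij} + \varepsilon_{ij},
\]
\[
(b_{0i},\, b_{1i})^\top \sim \mathcal{N}(0, \Sigma), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\]
```

Here b_{0i} is the random intercept and b_{1i} the random slope of subject i. Dropping the b_{1i} t_{ij} term gives the random-intercept-only form (1 | Sample_Time).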
For your beta_diversity call, the function attempts two model structures in order of complexity. It first tries:
Jaccard_distance ~ Treat * Time + (1 + Time | Sample_Time)
If that fails to converge, automatically simplifies to:
Jaccard_distance ~ Treat * Time + (1 | Sample_Time)
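The "try the richer model, then simplify" idea can be sketched with lme4 directly. This is not MicrobiomeStat's internal code: the simulated data, the column names (y, Subject) and the helper fit_with_fallback are hypothetical, and treating a singular fit as a failure is an assumption made for this illustration.

```r
library(lme4)

set.seed(1)
# Hypothetical design mirroring the thread: 4 treatments x 5 subjects x 3 time points
d <- expand.grid(Time = 1:3, Rep = 1:5, Treat = c("A", "B", "C", "D"))
d$Subject <- factor(paste(d$Treat, d$Rep, sep = "_"))
subj_shift <- rnorm(nlevels(d$Subject), sd = 0.5)
d$y <- 2 + 0.3 * d$Time + subj_shift[as.integer(d$Subject)] +
  rnorm(nrow(d), sd = 0.3)

fit_with_fallback <- function(data) {
  full    <- y ~ Treat * Time + (1 + Time | Subject)
  reduced <- y ~ Treat * Time + (1 | Subject)
  # Try random intercept + random slope first; NULL signals a warning or error
  fit <- tryCatch(
    lmer(full, data = data),
    warning = function(w) NULL,
    error   = function(e) NULL
  )
  # Fall back to random intercepts only if the full model failed or is singular
  if (is.null(fit) || isSingular(fit)) {
    fit <- lmer(reduced, data = data)
  }
  fit
}

fit <- fit_with_fallback(d)
formula(fit)  # shows which random-effects structure was kept
fixef(fit)    # fixed effects for Treat, Time and Treat:Time
```

Calling formula(fit) is a quick way to see whether the random slope survived, which matters when reporting the model.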
For your beta_diversity_volatility call, this is actually a different type of analysis. It:
First calculates volatility (rate of change between consecutive timepoints) for each subject
Then fits a simple linear model: volatility ~ Treat
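The two steps above can be sketched in base R. This is a minimal illustration, not the package's implementation: it assumes volatility is the mean distance between consecutive time points divided by the elapsed time, uses a hand-written Bray-Curtis dissimilarity, and runs on simulated counts. All object names (meta, counts, subject_volatility) are hypothetical.

```r
set.seed(42)
# Hypothetical design: 4 treatments x 5 subjects x 3 time points, 6 taxa
meta <- expand.grid(Time = 1:3, Rep = 1:5, Treat = c("A", "B", "C", "D"))
meta$Subject <- paste(meta$Treat, meta$Rep, sep = "_")
counts <- matrix(rpois(nrow(meta) * 6, lambda = 20), nrow = nrow(meta))

# Bray-Curtis dissimilarity between two abundance vectors
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Step 1: volatility of one subject = mean distance per unit time
# between consecutive samples of that subject
subject_volatility <- function(rows) {
  rows <- rows[order(meta$Time[rows])]
  steps <- vapply(seq_len(length(rows) - 1), function(k) {
    d <- bray_curtis(counts[rows[k], ], counts[rows[k + 1], ])
    d / (meta$Time[rows[k + 1]] - meta$Time[rows[k]])
  }, numeric(1))
  mean(steps)
}

rows_by_subject <- split(seq_len(nrow(meta)), meta$Subject)
vol <- data.frame(
  Subject = names(rows_by_subject),
  volatility = vapply(rows_by_subject, subject_volatility, numeric(1))
)
vol$Treat <- meta$Treat[match(vol$Subject, meta$Subject)]

# Step 2: one volatility value per subject, so an ordinary linear model suffices
fit <- lm(volatility ~ Treat, data = vol)
anova(fit)
```

Because each subject contributes a single volatility value, there are no repeated measures left at this stage, which is why no random effect appears in the second step.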
For your differential abundance analysis (linda), your formula is well-structured:
abundance ~ Time + Treat + Treat:Time + (1 | Sample_Time)
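This model estimates the effects of Time, Treat, and their interaction on each taxon, with a random intercept for each subject.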
Some suggestions for your analysis:
For the alpha and beta trend analyses, the default inclusion of random slopes is appropriate for longitudinal data but may not converge with only 3 timepoints. Don't worry if this happens - the functions will automatically simplify to random intercepts.
Make sure your "Sample_Time" variable uniquely identifies samples that are measured repeatedly. Each independent sample should have a consistent identifier across its timepoints.
For linda, you could consider matching the alpha/beta diversity models by using:
~ Time + Treat + Treat:Time + (1 + Time | Sample_Time)
Though your current random intercept model is also perfectly valid.
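The identifier check suggested above can be done in a few lines of base R before fitting anything. This is a sketch on hypothetical metadata: the helper check_subject_ids and the simulated table are illustrations, and the column names follow the ones used in this thread.

```r
# Hypothetical metadata: 4 treatments x 5 subjects x 3 time points
meta <- expand.grid(Time = 1:3, Rep = 1:5, Treat = c("A", "B", "C", "D"))
meta$Sample_Time <- paste(meta$Treat, meta$Rep, sep = "_")

check_subject_ids <- function(meta, subject, time, group) {
  # Rows = subjects, columns = time points, cells = number of samples
  visits <- table(meta[[subject]], meta[[time]])
  groups_per_subject <- tapply(
    as.character(meta[[group]]), meta[[subject]],
    function(g) length(unique(g))
  )
  list(
    n_subjects = nrow(visits),
    one_row_per_visit = all(visits == 1),          # each subject seen once per time point
    one_group_per_subject = all(groups_per_subject == 1)
  )
}

good <- check_subject_ids(meta, "Sample_Time", "Time", "Treat")

# A common mistake: an identifier that is unique per sample, not per subject
meta$per_sample_id <- paste(meta$Sample_Time, meta$Time, sep = "_")
bad <- check_subject_ids(meta, "per_sample_id", "Time", "Treat")

good$n_subjects  # 20 subjects, each measured at every time point
bad$n_subjects   # 60 "subjects", each measured only once
```

If the identifier changes with the time point, as in the second case, every sample becomes its own subject and the random effect can no longer capture repeated measures.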
Overall, your implementation looks appropriate for your experimental design (4 treatments, 3 timepoints, 5 replicates per treatment-timepoint combination). Let me know if you need any clarification about specific aspects of these models.
Best regards, Chen
PS: I'd like to encourage you to explore MicrobiomeStat's rich visualization capabilities to complement your statistical analyses.
I really appreciate your help, @cafferychen777.
Dear team MicrobiomeStat,
I very much appreciate your software contribution. I am new to linear mixed models. Could you please tell me whether I am using my data correctly?
In my experiment, I have these variables:
asv variable: taxonomic abundances from the DADA2 output
treat variable: 4 different treatments (A, B, C and D)
time variable: 3 different time points (1, 2, 3)
sample_treatment_time variable: 5 independent samples for each treatment and their respective replicates over time (60 samples in total)
My question is which ASVs in the community are affected by Treat, by Time, or by the Treat:Time interaction.
I enter my variables for the model in MicrobiomeStat as follows:
group.var = Treat
subject.var = sample_treatment_time
time.var = Time
Could you please explain what form the model equation takes?
From the manual, I am not sure whether the same model is used for alpha diversity, beta diversity, and differential abundance of ASVs.
Is my understanding correct that the model is y ~ time.var + group.var + time.var:group.var + (1 | subject.var)?
Regards