cafferychen777 / MicrobiomeStat

Track, Analyze, Visualize: Unravel Your Microbiome's Temporal Pattern with MicrobiomeStat
https://www.microbiomestat.wiki/

Request to understand the LMM models for alpha diversity, beta diversity, and differential abundance. #67

Open OrsonMM opened 2 days ago

OrsonMM commented 2 days ago

Dear team MicrobiomeStat,

I very much appreciate your software contribution. I am new to linear mixed models. Could you please tell me whether I am using my data correctly?

In my experiment, I have these variables:

asv variable: taxonomic abundances from the DADA2 output
treat variable: 4 different treatments (A, B, C and D)
time variable: 3 different time points (1, 2, 3)
sample_treatment_time variable: 5 independent samples for each treatment and their respective replicates over time (60 samples in total)


My question is which ASVs in the community are affected by Treat, Time, or their interaction (Treat:Time).

I entered my variables for the model in MicrobiomeStat as follows:

group.var = Treat
subject.var = sample_treatment_time
time.var = Time

Could you please explain the form of the model equation?

From the manual, I am not sure whether the same model is used for alpha diversity, beta diversity, and differential abundance of ASVs.

Is my understanding correct that the model is y ~ time.var + group.var + time.var:group.var + (1 | subject.var)?

Best regards

cafferychen777 commented 2 days ago

Dear Orson,

Thank you for your interest in MicrobiomeStat and for reaching out with your question about Linear Mixed Models (LMM). We appreciate your detailed description of your experimental design.

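From your description, I can see you have 4 treatments (A, B, C and D), 3 time points, and 5 independent samples per treatment followed over time (60 samples in total).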

While the model formula you suggested (y ~ time.var + group.var + time.var:group.var + (1|subject.var)) is generally appropriate for longitudinal microbiome data analysis, to better assist you, could you please specify which MicrobiomeStat function(s) you are using?

Each function might have slightly different implementations to accommodate the specific needs of alpha diversity, beta diversity, and differential abundance analyses.

Once you clarify which function(s) you're working with, I can provide more specific guidance about the model implementation.

Best regards

OrsonMM commented 2 days ago

Hi Caffery Yang,

Thanks for the rapid response.

Based on your response, I understand that each function uses a different model equation. I have the most doubts about these functions:

  1. Alpha diversity
    alpha_time_diversity <- generate_alpha_trend_test_long(
      data.obj = rarefy_data_genus,
      alpha.name = c("shannon", "simpson", "observed_species", "chao1", "ace", "pielou"),
      depth = NULL,
      time.var = "Time",
      subject.var = "sample_treatment_time",
      group.var = "Treat",
      adj.vars = NULL
    )
  2. Beta diversity
    beta_diversity <- generate_beta_trend_test_long(
      data.obj = rarefy_data_genus,
      dist.obj = NULL,
      subject.var = "sample_treatment_time",  # random effect: I do not understand whether this is a random slope or a random intercept
      time.var = "Time",                      # fixed effect
      group.var = "Treat",
      adj.vars = NULL,
      dist.name = c("Jaccard")
    )

    beta_diversity_volatility <- generate_beta_volatility_test_long(
      data.obj = rarefy_data_genus,
      dist.obj = NULL,
      subject.var = "sample_treatment_time",
      time.var = "Time",
      group.var = "Treat",
      adj.vars = NULL,
      dist.name = c("BC", "Jaccard", "UniFrac", "JS")
    )
  3. Differential abundance (DA)

Here I preferred to use linda because I can specify the equation myself (but I am not sure whether it is correct).

    model_1 <- linda(
      feature.dat = genus_normalizated_data$feature.tab,
      meta.dat = genus_data$meta.dat,
      formula = '~ Time + Treat + Treat:Time + (1 | sample_treatment_time)',
      feature.dat.type = c('proportion'),
      prev.filter = 0.1,
      mean.abund.filter = 0,
      max.abund.filter = 0,
      is.winsor = TRUE,
      outlier.pct = 0.03,
      adaptive = TRUE,
      zero.handling = c('imputation'),
      pseudo.cnt = 0.5,
      corr.cut = 0.1,
      p.adj.method = "fdr",
      alpha = 0.05,
      n.cores = 20,
      verbose = TRUE
    )

cafferychen777 commented 2 days ago

Hi Orson,

Thank you for your detailed follow-up questions about the model equations in MicrobiomeStat. I'll explain how each function implements its statistical models:

  1. Alpha Diversity Analysis

For your alpha_time_diversity call, the function implements a linear mixed effects model of the form:

    alpha_diversity ~ Treat * Time + (1 + Time | Sample_Time)

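This model includes fixed effects for Treat, Time, and their interaction, plus a random intercept and a random Time slope for each subject.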
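As a quick check on the fixed-effects part: in R formula syntax, `Treat * Time` is shorthand for `Time + Treat + Treat:Time`, so it matches the fixed effects in the formula you proposed. A minimal base R sketch on a made-up design (the data frame `design` and the column `Replicate` are hypothetical, and Time is assumed to be numeric here):

```r
# Hypothetical design: 4 treatments x 5 replicates x 3 time points = 60 rows.
design <- expand.grid(Time = 1:3, Replicate = 1:5, Treat = c("A", "B", "C", "D"))
design$Treat <- factor(design$Treat)

# Fixed-effects design matrices for the two ways of writing the model.
X_star <- model.matrix(~ Treat * Time, data = design)
X_long <- model.matrix(~ Time + Treat + Treat:Time, data = design)

# Same number of columns, and stacking both matrices adds no new directions,
# so the two formulas describe the same fixed-effects model.
n_cols <- c(ncol(X_star), ncol(X_long))
joint_rank <- qr(cbind(X_star, X_long))$rank
print(n_cols)
print(joint_rank)
```

If Time were coded as a factor instead, both matrices would gain columns, but they would still span the same space.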

  2. Beta Diversity Analysis

For your beta_diversity call, the function attempts two model structures in order of complexity:

First tries:

Jaccard_distance ~ Treat * Time + (1 + Time | Sample_Time)

If that fails to converge, automatically simplifies to:

Jaccard_distance ~ Treat * Time + (1 | Sample_Time)
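The try-then-simplify logic can be sketched with lme4 roughly as follows. This is only an illustration under assumptions, not MicrobiomeStat's internal code: the data are simulated, and `distance`, `Subject` and `fit_with_fallback` are hypothetical names.

```r
library(lme4)

set.seed(1)
# Hypothetical data: 20 subjects (5 per treatment), each measured at 3 time points.
dat <- expand.grid(Time = 1:3, Subject = factor(1:20))
dat$Treat <- factor(rep(c("A", "B", "C", "D"), each = 15))
subject_shift <- rnorm(20, sd = 0.05)
dat$distance <- 0.3 + 0.05 * dat$Time +
  subject_shift[as.integer(dat$Subject)] + rnorm(60, sd = 0.05)

fit_with_fallback <- function(data) {
  full <- distance ~ Treat * Time + (1 + Time | Subject)
  reduced <- distance ~ Treat * Time + (1 | Subject)
  tryCatch(
    lmer(full, data = data),
    # Treat both hard errors and convergence warnings as a failed fit
    # and fall back to the random-intercept model.
    error = function(e) lmer(reduced, data = data),
    warning = function(w) lmer(reduced, data = data)
  )
}

fit <- fit_with_fallback(dat)
print(formula(fit))  # shows which of the two structures was kept
```

With only 3 time points per subject, the random-slope variance is often poorly estimated, which is why a fallback of this kind is useful.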

For your beta_diversity_volatility call, this is actually a different type of analysis. It:

  1. First calculates volatility (rate of change between consecutive timepoints) for each subject

  2. Then fits a simple linear model: volatility ~ Treat
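The two steps can be sketched in base R. The volatility definition used here (mean distance per unit time between consecutive samples) is an assumption for illustration, and the data frame `steps` with its columns is hypothetical, not the package's internal representation.

```r
set.seed(2)
# Hypothetical input: one row per pair of consecutive time points within a subject.
steps <- expand.grid(Step = c("1->2", "2->3"), Subject = factor(1:20))
steps$Treat <- factor(rep(c("A", "B", "C", "D"), each = 10))
steps$dist <- runif(40, min = 0.1, max = 0.6)  # distance between consecutive samples
steps$dt <- 1                                   # time elapsed between the two samples

# Step 1: volatility = mean rate of change (distance per unit time) per subject.
steps$rate <- steps$dist / steps$dt
vol <- aggregate(rate ~ Subject + Treat, data = steps, FUN = mean)
names(vol)[names(vol) == "rate"] <- "volatility"

# Step 2: ordinary linear model across subjects. No random effect is needed
# because each subject now contributes a single value.
fit <- lm(volatility ~ Treat, data = vol)
print(anova(fit))
```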

  3. Differential Abundance Analysis (linda)

Your formula is well-structured:

    abundance ~ Time + Treat + Treat:Time + (1 | Sample_Time)

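This model estimates fixed effects for Time, Treat, and their interaction, with a random intercept for each subject.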
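Since you asked about the equation form, the random-intercept formula can be written out as below. This is a sketch under assumptions: Time is treated as numeric, treatment A is the reference level (R's default treatment coding), and $y_{ij}$ stands for the transformed abundance of one taxon in subject $i$ at time point $j$.

```latex
y_{ij} = \beta_0 + \beta_T\, t_{ij}
       + \sum_{g \in \{B, C, D\}} \beta_g\, \mathbb{1}[\mathrm{Treat}_i = g]
       + \sum_{g \in \{B, C, D\}} \gamma_g\, \mathbb{1}[\mathrm{Treat}_i = g]\, t_{ij}
       + b_i + \varepsilon_{ij},
\qquad b_i \sim \mathcal{N}(0, \sigma_b^2), \quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here $\beta_g$ is the Treat effect, $\gamma_g$ is the Treat:Time interaction, and $b_i$ is the random intercept from `(1 | Sample_Time)`. A random-slope model would add a term $b_{1i}\, t_{ij}$.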

Some suggestions for your analysis:

  1. For the alpha and beta trend analyses, the default inclusion of random slopes is appropriate for longitudinal data but may not converge with only 3 timepoints. Don't worry if this happens - the functions will automatically simplify to random intercepts.

  2. Make sure your "Sample_Time" variable uniquely identifies samples that are measured repeatedly. Each independent sample should have a consistent identifier across its timepoints.

  3. For linda, you could consider matching the alpha/beta diversity models by using:

    ~ Time + Treat + Treat:Time + (1 + Time | Sample_Time)

    Though your current random intercept model is also perfectly valid.
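For suggestion 2, a quick way to check the subject identifier is to count samples per ID. A minimal base R sketch on made-up metadata (the data frame `meta` and its column names are hypothetical; substitute your own meta.dat columns):

```r
# Hypothetical metadata: 20 subjects, each sampled at Time 1, 2 and 3.
meta <- expand.grid(Time = 1:3, Subject = sprintf("S%02d", 1:20),
                    stringsAsFactors = FALSE)
meta$Treat <- rep(c("A", "B", "C", "D"), each = 15)

# A usable subject identifier repeats once per time point ...
samples_per_subject <- table(meta$Subject)
# ... and every subject belongs to exactly one treatment.
treats_per_subject <- tapply(meta$Treat, meta$Subject,
                             function(x) length(unique(x)))

ok <- all(samples_per_subject == 3) && all(treats_per_subject == 1)
print(ok)

# Counter-example: an ID that is unique per sample has one level per row,
# so a (1 | id) term cannot separate subject variation from residual noise.
meta$sample_id <- paste(meta$Subject, meta$Time, sep = "_")
print(length(unique(meta$sample_id)) == nrow(meta))
```

If your identifier behaves like `sample_id` in the counter-example (60 distinct values for 60 samples), it should be replaced by one that is shared across the three time points of the same subject.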

Overall, your implementation looks appropriate for your experimental design (4 treatments, 3 timepoints, 5 replicates per treatment-timepoint combination). Let me know if you need any clarification about specific aspects of these models.

Best regards, Chen

cafferychen777 commented 2 days ago

PS: I'd like to encourage you to explore MicrobiomeStat's rich visualization capabilities to complement your statistical analyses.

OrsonMM commented 2 days ago

I appreciate your help so much, @cafferychen777