adw96 / DivNet

diversity estimation under ecological networks

Paired samples using betta_random/testDiversity #55

Closed: kunstner closed this issue 3 years ago

kunstner commented 4 years ago

Hi,

I'm still diving deeper into your wonderful packages for estimating alpha diversity of microbiome data.

I have paired data from a time course experiment (three time points) testing three different products (skin swabs). For each individual, all three products were sampled at all three time points.

So far, I have estimated Shannon diversity using divnet(), and now I would like to test whether there are differences between time points for each product. Usually, I use non-parametric Wilcoxon tests to perform all the comparisons needed (with p-value adjustment for multiple testing). I'm wondering whether I can use betta_random for the statistical testing as well.

Here is my piece of code, but I'm not sure whether the more traditional way (Wilcoxon test) is more suitable in this case:

dv_ps_cyp <- x %>% divnet(X = "Time", base = "ASV1", ncores = 10)

ds <- dv_ps_cyp$shannon %>%
  summary %>%
  add_column("SampleNames" = x %>% otu_table %>% sample_names,
             "Time" = x %>% sample_data %>% .$Time,
             "Product" = x %>% sample_data %>% .$Product) %>%
  arrange(Time, Product)

betta_random(chats = ds$estimate,
             ses = ds$error,
             X = model.matrix(~Time*Product, data = ds),
             groups = ds$SampleNames) %>%
  .$table

Thanks in advance for any advice!

Best, Axel

adw96 commented 4 years ago

Hi Axel! Thanks for your patience while I get back to you.

Could you please clarify for me what are the hypotheses that you want to test? Do you think you could write them (all) down?

Is it that you have 9 observations (3 time points for each of 3 subjects), and you want to compare

Random effects can indeed be useful with betta_random, but whether they apply depends on whether your swabs/products are a random effect or not. The hypothesis that you care about determines which test to use.

Happy to help on this!

Amy

kunstner commented 4 years ago

Hi Amy,

thanks for your reply.

The design is as follows. We have 30 subjects (s1...s30). Each subject is sampled at three different locations (L0, L1, L2) using skin swabs, where L0 remains untreated and L1 and L2 are treated with two different substances. During the course of the experiment we take skin swabs at t0, t1, and t2. Basically, all locations should be more or less similar at t0 (baseline) and L0 should stay unchanged during the course of the experiment (diversity is similar at all three time points).

The idea is to compare each subject to its own baseline ('pairwise testing' L1 at t1 vs L1 at t0 and so on). Additionally, we would like to test against the baseline at each time point (e.g. L1 at t1 vs L0 at t1). Does this make sense?

So far I always used non-parametric paired tests to perform the analysis.

Many thanks for your help, Axel

fconstancias commented 4 years ago

Hi @kunstner, I have a similar design. Were you able to make progress?

Thanks for sharing.

Best,

Florentin

kunstner commented 4 years ago

Hi @fconstancias,

I decided to use another test framework. I estimated diversity using DivNet and applied non-parametric test frameworks to test for differences (Kruskal-Wallis/Wilcoxon).
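A minimal sketch of what such a paired non-parametric comparison could look like, using base R only. The data here are simulated placeholders (the column names `Subject`, `Time`, and `estimate` are assumptions, not taken from the real analysis), and Benjamini-Hochberg is chosen only as an example adjustment. Note that this approach treats the DivNet point estimates as if they were observed without error, so the standard errors are not used.

```r
# Simulated placeholder data: 30 subjects, one treated location,
# Shannon estimates at t0, t1 and t2. Replace with the DivNet estimates.
set.seed(1)
n_subjects <- 30
shannon <- data.frame(
  Subject  = factor(rep(seq_len(n_subjects), times = 3)),
  Time     = factor(rep(c("t0", "t1", "t2"), each = n_subjects)),
  estimate = c(rnorm(n_subjects, 3.0, 0.3),
               rnorm(n_subjects, 2.8, 0.3),
               rnorm(n_subjects, 2.6, 0.3))
)

# Order by subject within each time point so that the pairs line up
shannon <- shannon[order(shannon$Time, shannon$Subject), ]
baseline <- shannon$estimate[shannon$Time == "t0"]

# Paired Wilcoxon signed-rank test of each later time point against t0
later <- c("t1", "t2")
p_raw <- sapply(later, function(tp) {
  wilcox.test(shannon$estimate[shannon$Time == tp], baseline,
              paired = TRUE)$p.value
})

# Adjust for multiple testing (method is an example choice)
p_adj <- p.adjust(p_raw, method = "BH")
print(data.frame(comparison = paste(later, "vs t0"), p_raw, p_adj))
```

The same pattern would apply to comparing a treated location against the untreated one at a fixed time point, pairing on subject instead of on time.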

Best, Axel

paulinetrinh commented 3 years ago

Hi Axel @kunstner and Florentin @fconstancias!

I wanted to chime in on this issue about diversity hypothesis testing using paired samples! As Amy mentioned previously, random effects can be very useful with betta_random. Based on your study design it looks like you have some measurements that have been repeated on each individual at three different skin sites.

Repeated measurements on an individual are typically correlated (but not always) and a concern you might have is that you'd like to account for that within-person correlation in your hypothesis testing. Random effects can be really useful for these scenarios!

For example, suppose we had two groups of patients (a treatment and a placebo group) who had repeated sampling of their gut microbes each month over 6 months. If we were interested in understanding the effect of treatment on Shannon diversity in the gut over time, we could specify treatment as a fixed effect and subject as a random effect, allowing each subject to have their own intercept.
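As a rough sketch of that example (not a recommendation for any particular study), the model could be set up as below. All data are simulated placeholders, the column names are hypothetical, and the constant standard error stands in for the errors DivNet would report. The call uses the same `chats`, `ses`, `X`, and `groups` arguments of betta_random shown earlier in this thread, with `groups` set to the subject so that each subject gets their own random intercept.

```r
# Simulated placeholder data: 10 subjects per arm, sampled monthly for 6 months
set.seed(1)
n_per_arm <- 10
n_months  <- 6
design <- expand.grid(Month   = seq_len(n_months),
                      Subject = seq_len(2 * n_per_arm))
design$Treatment <- factor(
  ifelse(design$Subject <= n_per_arm, "placebo", "treatment"),
  levels = c("placebo", "treatment")
)

# A per-subject intercept induces within-person correlation
subject_effect <- rnorm(2 * n_per_arm, sd = 0.3)
design$estimate <- 3 + 0.4 * (design$Treatment == "treatment") +
  subject_effect[design$Subject] + rnorm(nrow(design), sd = 0.1)
design$error <- 0.1  # placeholder standard errors of the estimates

# Fixed effects: intercept and treatment. Random effect: subject (via groups).
X <- model.matrix(~ Treatment, data = design)

# Only fit if the breakaway package (which provides betta_random) is installed
if (requireNamespace("breakaway", quietly = TRUE)) {
  fit <- breakaway::betta_random(chats  = design$estimate,
                                 ses    = design$error,
                                 X      = X,
                                 groups = factor(design$Subject))
  print(fit$table)
}
```

The key design choice is that `groups` identifies the subject, not the individual sample, so that repeated measurements on the same person share a random effect.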

It's difficult for me to figure out if what you're doing is appropriate for your study questions without a clearer understanding of your hypotheses. I can point you to some resources that might help in understanding how to answer your question of interest using fixed and/or random effects. Here & here.

Otherwise, I'd suggest checking in with your institution's biostatistician/statistician on how best to specify your model and answer your question of interest! Thanks for your time and interest in DivNet!

Cheers, Pauline