Help strata argument in adonis function

qducarmon commented 3 years ago

Dear,

For my study, I have repeated measurements and I want to account for this in the adonis function using the strata argument. I have either 2, 3 or 4 samples (16S-based data) per individual (27 individuals in total) and want to investigate whether there is an overall difference between individuals based on their disease status, the Ever_Colonized group (yes/no; see below), while accounting for the repeated measurements. I thought I could do this using the following code:

adonis(bray_matrix~Ever_Colonized,data=metadata_perm,permutations=999,strata="Residentnumber")

However, I get the following error:

Error in check(sn, control = control, quietly = quietly) : Number of observations and length of Block 'strata' do not match.

I am not sure what caused this, but from what I could find online, it is likely because adonis requires a purely balanced design (the same number of samples per individual). Is this correct? And if so, what is the solution? In any case, I am wondering what the best approach is, as the small number of samples per individual means the number of permutations will be quite low.

Best, Quinten

jarioksa commented 3 years ago

You should not quote Residentnumber, but just write strata = metadata_perm$Residentnumber. You need the name of the data frame as strata cannot be found automatically from data.
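The point about quoting can be illustrated with a minimal sketch using the permute package, which vegan relies on for restricted permutations. The data frame below is a hypothetical stand-in for metadata_perm, with deliberately unequal numbers of samples per resident; it is an illustration under those assumptions, not the original data.

```r
library(permute)

# Hypothetical metadata: 3 residents with 2, 3 and 4 samples (unbalanced)
metadata_perm <- data.frame(
  Residentnumber = factor(rep(c("r1", "r2", "r3"), times = c(2, 3, 4)))
)
n <- nrow(metadata_perm)

# A quoted name is a character vector of length 1, not one label per sample
length("Residentnumber")
# The column itself has one entry per sample, which is what the blocks need
length(metadata_perm$Residentnumber)

# Blocks supplied as a per-sample vector are accepted even though the
# residents contribute unequal numbers of samples
ctrl <- how(blocks = metadata_perm$Residentnumber)
perm <- shuffle(n, control = ctrl)

# Samples are only ever swapped with samples from the same resident
same_block <- all(
  metadata_perm$Residentnumber[perm] == metadata_perm$Residentnumber
)
same_block
```

In this sketch the length mismatch comes from the quoted string, not from the unbalanced design.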

qducarmon commented 3 years ago

Dear Jari,

Thanks for the extremely fast answer! When I implement your suggestion, it indeed works, so that's great. However, I still have a question about the resulting output. When running adonis(bray_matrix~Ever_Colonized,data=metadata_perm,permutations=999,strata=metadata_perm$Residentnumber), I get output in which the Ever_Colonized variable is completely non-significant.

When I run the code without the strata argument, adonis(bray_matrix~Ever_Colonized,data=metadata_perm,permutations=999), the Ever_Colonized variable is highly significant.

Does this basically mean that when correcting for intra-individual variation (the resident), there is absolutely no effect of the Ever_Colonized variable (yes/no) on overall microbiota composition?

Kind regards and thanks a lot for your help, Quinten

jarioksa commented 3 years ago

Basically, it can be regarded as meaning that. The strata argument restricts permutations to within blocks of strata, so in principle you study the effect of Ever_Colonized within blocks only. However, this treats your strata as fixed and non-exchangeable: an analysis treating Residentnumber as a random effect could potentially give a somewhat different view, but random effects cannot be analysed in adonis (I don't have a clue how to implement them). So take this as a limitation of the method.

qducarmon commented 3 years ago

Thanks for confirming that. So if I understand you correctly, the effect of Ever_Colonized is studied within each resident? The only thing I am not sure about in that case is the following: each resident only has 'yes' or 'no' for Ever_Colonized, so the value is the same across all time points within each resident. Then permutations don't make much sense, right, since each resident has only one value and there is never a mix of 'yes' and 'no' within a resident? Or do I misunderstand it and am I overcomplicating things? I just want to make sure that I understand it well.

Alternatively, I have another variable where there is a mix of 'yes' and 'no' based on sample-level colonization (the Ever_Colonized variable was 'yes' if an individual was positive at one or more time points, and 'no' if never positive), so here the value can vary within each resident (some samples of a resident can be positive, some negative). So if what I understood above is correct, would it perhaps make more sense to use strata in this case? I hope my questions are clear; if not, please let me know!

Kind regards, Quinten
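The concern raised in this comment can be checked directly with the permute package. The sketch below uses made-up residents and labels (all names and values are hypothetical) to compare a factor that is constant within each resident with one that varies between samples of the same resident, under permutations restricted to within residents.

```r
library(permute)

# Hypothetical design: 4 residents, 3 samples each
resident <- factor(rep(paste0("r", 1:4), each = 3))

# Resident-level factor: constant within each resident
ever_colonized <- rep(c("yes", "no", "yes", "no"), each = 3)

# Sample-level factor: can vary within a resident
sample_colonized <- c("yes", "no", "no",
                      "no", "no", "no",
                      "yes", "yes", "no",
                      "no", "no", "no")

# Permutations restricted to within residents
ctrl <- how(blocks = resident, nperm = 199)
set.seed(1)
perms <- shuffleSet(length(resident), control = ctrl)

# Does any permutation change the labelling of the samples?
changes_between <- any(apply(perms, 1, function(p) {
  any(ever_colonized[p] != ever_colonized)
}))
changes_within <- any(apply(perms, 1, function(p) {
  any(sample_colonized[p] != sample_colonized)
}))

changes_between  # resident-level factor
changes_within   # sample-level factor
```

Because the resident-level factor is identical for every sample of a resident, shuffling within residents cannot rearrange it, so every permuted data set reproduces the observed labelling. That would be consistent with the completely non-significant result reported earlier in the thread. A factor that varies within residents is rearranged by the same scheme.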

gabridinosauro commented 1 year ago

Hello,

I am also interested in this problem. An experimental design with pre- and post-treatment measurements on the same subject/patient/plot is a very common scenario in ecology, but the correct implementation in adonis or adonis2 still remains a mystery and is the subject of many discussions on various forums. Some claim that the subject should be used in the strata argument, others that it should be a predictor. It would be very nice if the developers could comment on this. What is the right way? Strata or not?

Thanks! Gabri

gavinsimpson commented 1 year ago

All of this would be clearer if people use dbrda() instead of adonis2() - then we could use Condition() and think about doing a design-based or model-based permutation test directly using the functionality in permute that vegan already hooks into.
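As a rough sketch of what this suggestion might look like in practice, the example below fits a dbrda() model with the subject partialled out via Condition() and tests the treatment term with permutations restricted to within subjects. The data are simulated and all object names (antibio_md, patient, pre_post_abx, dist.mat) are assumptions made for illustration; whether this design-based test is the appropriate one depends on the study.

```r
library(vegan)  # also attaches permute

set.seed(42)

# Simulated (hypothetical) design: 15 patients, sampled before and after treatment
antibio_md <- data.frame(
  patient = factor(rep(paste0("p", 1:15), each = 2)),
  pre_post_abx = factor(rep(c("pre", "post"), times = 15))
)

# Random count table (30 samples x 10 taxa) standing in for real community data
comm <- matrix(rpois(30 * 10, lambda = 5), nrow = 30)
dist.mat <- vegdist(comm, method = "bray")

# Partial out patient with Condition(), keeping the treatment as the tested term
mod <- dbrda(dist.mat ~ pre_post_abx + Condition(patient), data = antibio_md)

# Restrict permutations to within patients
ctrl <- how(blocks = antibio_md$patient, nperm = 999)
res <- anova(mod, by = "margin", permutations = ctrl)
print(res)
```

With simulated noise like this, no treatment effect is expected; the sketch only shows how the pieces fit together.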

gabridinosauro commented 1 year ago

@gavinsimpson Hi Gavin, thanks a lot for your comment (and for helping me discover your blog!).

So you are suggesting that condition can be used as a sort of random variable in dbRDA? So for example, I have 30 samples coming from 15 patients. Each of them was treated with antibiotics. I have a sample before the treatment and a sample after the treatment for each one.

Would the formula, using dbRDA, be the following?

test = dbrda(distance.mat ~ pre_post_abx + Condition(patient))

Is that correct?

If I had more than one explanatory variable, I could then also check the significance of each variable with anova(test, by = "margin"). Is there also a way to extract the individual R2 values? Thanks in advance! Gabri

gabridinosauro commented 1 year ago

Hello everyone,

I think I figured it out, but using adonis2. I am using the following code:

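library(vegan)  # also attaches permute, which provides how(), Within() and Plots()

# Permute samples freely, but only within each patient (block)
permu_scheme <- how(within = Within(type = "free"),
                    plots = Plots(type = "none"),
                    blocks = antibio_md$patient,  # patient variable
                    nperm = 99999,
                    observed = TRUE)
perm <- adonis2(dist.mat ~ pre_post_abx, permutations = permu_scheme, data = antibio_md)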

This appears to me to be the best way, as I am restricting the permutations to within patients. The code above with dbRDA gave me significant p-values, but I think it also permutes between plots. I have not really understood how to use the Condition() function, as I have not been able to find any help so far.

Cheers, Gabri

gabridinosauro commented 1 year ago

Following up: for example, I have 30 samples coming from 15 patients. Each of them was treated with antibiotics. I have a sample before the treatment and a sample after the treatment for each one. I want to know if the community changes after antibiotics. I do this:

permu_scheme <- how(within = Within(type = "free"),
              plots = Plots(type = "none"),
              blocks = antibio_md$patient, ### Patient_variable
              nperm = 99999,
              observed = TRUE)

dbr_da = dbrda(dist.mat ~ pre_post_abx + Condition(patient), data = antibio_md)
anova(dbr_da, by = "margin", permutations = permu_scheme)
RsquareAdj(dbr_da)

This should be the right code with dbrda, correct? I also see many issues here on GitHub asking for help with repeated-measures analysis in vegan. It would be very helpful for vegan users to have a more concrete answer. These things are not immediately obvious if you do not work with them daily. Thanks very much!! Gabri

gabridinosauro commented 1 year ago

Hi @gavinsimpson . Did you have the chance to look at this and confirm that this is correct? I am still not 100 percent sure.

Thanks in advance and also thanks for your awesome work!

vegandevs / vegan

Help strata argument in adonis function #384