florale / multilevelcoda

multilevelcoda R package for Bayesian compositional multilevel modelling
https://florale.github.io/multilevelcoda/
GNU General Public License v3.0
13 stars 2 forks source link

substitution() format #5

Closed LonelyHunter2 closed 4 months ago

LonelyHunter2 commented 5 months ago

Hello there!

While working with the multilevelCoDA package, I tried to understand how to use the substitution() function.

First of all, the example [https://florale.github.io/multilevelcoda/reference/substitution.html] do work. However, I'm getting error messages when trying to apply it to other formats.

For instance, if mv_test =

mv_test <- brmcoda(compilr = cilr_test,
              formula =   Stress ~ ilr1 + ilr2 + ilr3 + ilr4  + (1 | ID),
              cores = 8, seed = 123, backend = "cmdstanr")

then

substitution(object = mv_test,delta = 5)

I get the following error:

Error: The following variables can neither be found in 'data' nor in 'data2': 'ilr1', 'ilr2', 'ilr3', 'ilr4'

However, when I look at my data.frame, I do see it.

    Stress     ilr1  ilr2    ilr3    ilr4  ID
1        4  0.29104 1.202  0.6298  1.7095 185
2        7 -0.47590 1.580 -0.8348  0.9857 185
3        3 -0.49829 1.335  1.3223  2.6373 185
4        2 -0.31260 1.367 -0.0314  0.5533 185
5        8  0.20438 1.431 -0.6875  0.7313 185
6        8 -0.44414 1.156 -0.0955  0.6683 185```

Then, I get different errors when I try to apply the substitution with compositional data as outcome. 

```mv_test2 <- brmcoda(compilr = cilr_test,
              formula = mvbind(ilr1, ilr2, ilr3, ilr4) ~ 1 + (1 | ID),
              cores = 8, seed = 123, backend = "cmdstanr")

 substitution(object = mv_test2,delta = 5)
Error in y0[, h] : incorrect number of dimensions

And when I use my own data: [Where FB is a 3 level within subjects condition.

mv <- brmcoda(compilr = cilr,
              formula = mvbind(ilr1, ilr2, ilr3) ~ FB + (1 | Participant),
              cores = 8, seed = 123, backend = "cmdstanr")

substitution(object = mv,delta = 5)
Error in `[.data.frame`(comp0, , head(.SD, 1), by = eval(object$CompILR$idvar)) : 
  unused argument (by = eval(object$CompILR$idvar))

Any ideas on how I should se tup the data so that it works?

Thanks!

florale commented 5 months ago

Hi,

Thanks for your questions!

It would be helpful to check if the ilr variables are duplicated, by running model.frame(mv_test). So essentially, your data.frame shouldn't have any variables named starting with ilr. The ilr variables that are fed into brmcoda() are internally generated from the cilr_test output, not from the raw dataset.

substitution() doesn't work for models with compositional outcomes, as it examines the expected changes in an outcome that are associated with the reallocation across a set of compositional variables (expressed as ilr coordinates. It only works for models with compositional predictors.

LonelyHunter2 commented 5 months ago

If it doesn't work with the compositional outcome, the dimension error does make sense then since the outcome > 1.

Any suggestion on how to "post hoc" compositional outcome with your package if substitution is used for compositional predictor? More specifically, I'm looking at whether my predictors (Independent Variable) are linked to a shift in my compositional outcome (i.e. % contribution of four frequency bands that sum to 100 (or 1) [if the contribution of one band increases, the other decreases]). I have 3 predictors: 1 between (2 levels) and 2 within (2 and 3 levels) representing experimental conditions. Before finding this package, I was thinking of transforming my data (either ilr or clr) and then just parsing the transformed data into a linear mixed model (lm4) before using an ANOVA or MANOVA (still not sure on that last part). It's somewhat difficult to find information and help on what to do with a "complex" model like mine once the model (either Brms / Dirichlet) has been created.

Concerning the other problem, when running model.frame(mv_test), I do see the ilrs variables: image

Thanks !

florale commented 4 months ago

Currently, we don't have a function for that post hoc analysis. But it can be calculated using the posterior draws from the model. I do want to implement something like that for models with compositional outcomes, but I haven't got the time and resources to get into it.