Open rduesing opened 7 months ago
Hi @rduesing , I have been implementing features in bayer
. I will follow your points in my reply stick with the model example you provided with data
from mtcars
:
Interpcept
and 0/1
: you can simply pass them as covariates: bayer_linear( ..., covariates = c( 0, "Intercept", "cyl", "qsec" ), ... )
sigma
specification: you can now use it in the model: bayer_linear( ..., sigma_covariates = c( 0, "cyl", "carb" ), nu_link = "logm1", ... )
bayer_linear( ..., random_covariates = "gear", ... )
bayer_linear( ..., transformed = FALSE, ...)
and you will get results (and plots) with original metricslog
or logit
transformed variables and sigma
posterior
, I also added a extract_posterior( ..., plot_risk_word = 'posterior', ...)
option to allow for more variability based on users needextract_posterior( ..., var = "cyl", var_level = 6, var_level_sigma=TRUE, ...)
and the function will return this level values only. var_level_sigma
is to specify if you want b_cyl6
or b_sigma_cyl6
, in case you passed it as a variable for sigma
analysis$stats
in the returned object from extract_posterior()
with several values (mean, sd, se, median, q1, q3, min, max, MAP, ETI, HDI, ROPE) for each level of the variable of interest (or the specified level if var_level
was set. These are returned accordingly to transformed
as welltransformed
, if not please let me knowPlease note, to see changes you need to reinstall bayer
: remotes::install_github( 'cccnrc/bayer' )
I am still working on adapting few things, I would love to have your suggestion on plot text values as example. I kindly ask you to give it a try to the new version and let me know if you find any problem, bugs, and further enhancement!
Thanks again for reaching out!
Enrico
HI Enrico,
many thanks. I tried two models, my own and the mtcars example, but I couldnt get it running. I installed the latest version of bayer, on the help site random_covariates
is displayed, so that cant be the issue.
The code:
bayer.1 <- bayer_linear(data = data_s_long_nm,
outcome = "values",
robust = TRUE,
covariates = c("0", "Intercept", "CBLC"),
random_covariates = "Pat_ID",
sigma_covariates = c("0", "CBLC"),
cores = 4,
brm_iter = 6000)
dat <- mtcars
bayer.test <- bayer_linear(data = dat,
outcome = "mpg",
covariates = c( 0, "Intercept", "cyl", "carb"),
sigma_covariates = c( 0, "cyl", "carb" ),
random_covariates = "gear")
I get the following errors:
"outcome" passed is numeric: suitable for Linear regression: [1] "num: 164 -> mean: 65.95 (9.54) - median: 65.5 [60 - 72] - min: 37, max: 90 - NA: 0 (0%)" Fehler in riptw::get_formula(outcome = outcome, covariates = covariates, : unbenutztes Argument (random_covariates = random_covariates)
and
"outcome" passed is numeric: suitable for Linear regression: [1] "num: 32 -> mean: 20.09 (6.03) - median: 19.2 [15.43 - 22.8] - min: 10.4, max: 33.9 - NA: 0 (0%)" Fehler in riptw::get_formula(outcome = outcome, covariates = covariates, : unbenutztes Argument (random_covariates = random_covariates)
It translates "unused argument (random_covariates)..." I get the same message, even if I do not specify any random effects.
Hi @rduesing this was an error due to a change I needed to operate on get_formula()
from riptw that is imported in bayer
Can you try to reinstall riptw
too and see if this solves it?
remotes::install_github( "cccnrc/riptw" )
Hi Enrico,
it runs now, many thanks. I am still testing, but what I found with my real dataset: your default max_tree_depth is 15, which seems quite large to me and maybe unnecessary high. This might slow down the sampling (but I am not an expert here!). Additionally, I found that with a tree depth of 15, I had divergent chains and a very low ESS, as well as high Rhat values. When I reduced it to 12 (10 is default), everything worked fine.
Maybe you can find answers here: https://discourse.mc-stan.org/t/the-role-of-max-treedepth-in-no-u-turn/24155/3
One more question/request: is it possible to have random slopes and if "yes" already, how to formulate it? So far, we only used random intercepts.
I will get back to you, as soon as I have new insights. Thanks for your help!
Hi @rduesing thanks for pointing this out! I changed the default to 10 and you can always customize it through brm_max_treedepth
option
Regarding random slopes, can you provide a model (just a simple regression one, not a brms
) formula in order to see what is implementable yet?
Thanks again, Enrico
I dont know if this can "simply" be written as an regression model, since it will have several error terms. In the literature you will find different notations to write multilevel models. I try to use one that is understandable for me.
Let's start with a simple regression model, without any random effects. This would look like this for example:
yi = β0 + β1xi + ei
So, no random effect and the intercept and slope(s) are fixed/population effects for all participants. In R you can use the same formula with lm and brms:
y ~ 1 + x
or alternatively in brms also: y ~ 0 + Intercept + x
Now, if you have a random intercept, there will be a fixed/population part for the intercept, as well as a random part. You can write this as two equations:
yij = β0j + β1 x + eij
where β0j can be written as:
β0j = γ00 + U0j
where γ00 represents the population/fixed effect Intercept and U0j the group specific effect, i.e. the random part.
Some authors substitute it into one formula (whereas I myself find it simpler to think in several formulas...):
yij = γ00 + U0j + β1 x + eij
It is not possible simply write it as a regression in R. Brms uses the syntax of the lme4 package, which is quite intuitive
y ~ 1 + x + (1 | ID)
where the (1 | ID) is for the random Intercept
In a next step you not only allow for random Intercepts but also random slopes. Therefore, you need to specify which slopes (in case of several predictors) are allowed to vary within which clustering variable (there may be more than a two level model). This may look like this for example:
yij = β0j + β1j x + eij with
β0j = γ00 + U0j but also
β1j = γ10 + U1j
We can see that γ10 is he population/fixed effect of x on y, but that there is a cluster specific/random effect U1j, similar to the random intercept. Put together:
yij = γ00 + U0j + γ10 x + U1j x + eij
or
yij = γ00 + γ10 x + U0j + U1j x + eij
to make the fixed and random parts clearer.
In lme4 and brms this would look like
y ~ 1 + x + (x | ID)
Does this help?
I @rduesing I added the option random_slope
to allow for random slope as you suggested:
mod.2 <- bayer_linear(data = data,
outcome = "mpg",
covariates = c("cyl", 'hp'),
random_slope = 'cyl',
random_covariates = "gear",
robust = TRUE
)
> mod.2
# ...
Formula: mpg ~ cyl + hp + (cyl | gear)
# ...
Remember you need to update both bayer
and riptw
to the latest version:
remotes::install_github( "cccnrc/riptw" )
remotes::install_github( "cccnrc/bayer" )
Looking forward to hearing from you for everything!
Dear Enrico!
As requested, I will post my enhancement proposals here on github. Let's start with the generic example I gave. For example, I have 2 or more time points/groups and I want to predict sigma as well. How can I formulate a repeated measures model, which allows for heterogenity? My generic code was:
You "translated" it to the mtcars data set as follows:
Since my request specifically was concerned with nominal predictors, I would adjust you example and use "cyl" as predictor and reformulate the model for sigma, since this directly incorporates an enhancement request:
Which should give you something like this as output with default priors:
Now I tried it with bayer.
There were the first issues:
0 + Intercept + predictor
instead of1 + predictor
. This is because brms assumes for the latter centered/standardized predictors and will automatically adjust the priors for the intercept. With the former formula, you can put specific priors on the Intercept itself. It is not clear in bayer, what exactly will be formulated and transferred to brms and hence stan.... + (1|ID)
. There is no analogue in the mtcars example, but I think this is an important feature. (That's why I have chosen "time" as a generic example, which calls for the random intercept effect for ID/level 2 variable)The bayer output looks like this:
Of course the output looks very similar to the first model, if you look at the estimates for Intercept and slopes. But the difference is in the sigmas. The first model requested sigmas for each group separately, what is not possible (yet) in bayer. Additionally, for the first model, a log link was used to estimate the sigmas as compared to the identity link of the bayer model. This is automatically done by brms if you estimate sigma. Therefore an enhancement request:
If I draw your posterior with the inbuilt function and plot it:
I get the following figure:
Other requests concerning the posterior and posterior wrangling:
So, I really appreciate your effort and would like to help. My problem is not so much with the model formulation itself, but to "translate" the priors and to handle the posterior. I know what I want, but the scripting is time consuming and it would be a great help to automate some processes, which would make it also less prone for errors.
I am really looking forward to hear from you Rainer