paul-buerkner / brms

brms R package for Bayesian generalized multivariate non-linear multilevel models using Stan
https://paul-buerkner.github.io/brms/
GNU General Public License v2.0

custom families and `mu` parameter #1596

Closed venpopov closed 4 months ago

venpopov commented 7 months ago

Hi Paul,

Currently brms requires that all families have a mu parameter, which is then interpreted as the parameter predicted by the formula with the response variable. For custom families, this means that one parameter has to be labeled mu, even if conceptually it has nothing to do with a mean. Everything works, but it would be more transparent if all parameters could be labeled freely, which would also help in documenting complex models. In particular, it is nice when the model parameter labels match those in a corresponding publication.

Would it be a lot of effort to allow this? I understand that labeling a parameter 'mu' allows brms to know which parameter is predicted by the main response formula. Perhaps a solution would be to add an additional argument to the custom_family function which specifies the "main" parameter that should be predicted by the response formula?
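To make the constraint concrete, here is a minimal sketch of how such a family is declared today. The family name `my_model`, the parameter `kappa`, and the predictors `condA` and `condB` are hypothetical placeholders, not taken from any real model; only `custom_family()` and `bf()` are actual brms functions. The second call is wrapped in `try()` because, per the behaviour described above, a family without a parameter labeled mu is expected to be rejected, and the exact outcome may differ between brms versions.

```r
library(brms)

# Hypothetical family: here 'mu' is conceptually a probability-like
# parameter, not a mean, but it still has to carry the label 'mu'.
my_family <- custom_family(
  "my_model",
  dpars = c("mu", "kappa"),
  links = c("logit", "log"),
  lb = c(0, 0), ub = c(1, NA),
  type = "real"
)

# The right-hand side of the response formula is attached to 'mu';
# 'kappa' gets its own formula.
f <- bf(y ~ condA, kappa ~ condA + condB, family = my_family)

# Same family with a freely chosen label ('theta' is hypothetical).
# Wrapped in try() so the script continues whether or not the
# installed brms version accepts it.
res <- try(
  custom_family(
    "my_model",
    dpars = c("theta", "kappa"),
    links = c("logit", "log"),
    type = "real"
  ),
  silent = TRUE
)
```

Nothing is compiled or sampled here; the sketch only builds the family and formula objects.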


Edit:

More generally, it feels a bit unintuitive to have a formula of the type

y ~ condA
par2 ~ condA + condB
par3 ~ condC
par4 ~ ....

where y is the response variable, but it is implicitly translated as "par1 ~ condA". This syntax obscures the fact that the response variable is a function of all predictors for all parameters. Conceptually, the model reflects:

y ~ SomeModelDistribution(par1, par2, par3, par4)
par1 ~ ... (varies over some conditions)
par2 ~ ...

This would be a bigger design change, and maybe it's something you don't like, but I could imagine a future major update having a syntax like:

brm(y ~ family(), par1 ~ condA, par2 ~ ...)

which would make the structure of models more transparent.
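For comparison, the implicit translation in the current syntax can be inspected with `brmsterms()`. This is a minimal sketch using the built-in gaussian family; `y`, `condA`, and `condB` are placeholder variable names, and the exact structure of the returned object is an assumption that may vary across brms versions.

```r
library(brms)

# Current syntax: the response formula carries the predictors for 'mu',
# while 'sigma' gets an explicit formula of its own.
f <- bf(y ~ condA, sigma ~ condA + condB, family = gaussian())

# Parse the formula into per-parameter terms.
bt <- brmsterms(f)

# Names of the distributional parameters that received a formula;
# 'mu' appears here even though it was never written by the user.
names(bt$dpars)
```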

venpopov commented 7 months ago

I see that in the description for brmsformula you have the following paragraph:

Parameter mu exists for every family and can be used as an alternative to specifying terms in formula. If both mu and formula are given, the right-hand side of formula is ignored. Accordingly, specifying terms on the right-hand side of both formula and mu at the same time is deprecated. *_In future versions, formula might be updated by mu._*

So I guess you are already thinking in this direction? If so, broadening this a bit by not requiring every model to have a mu parameter could be nice.

paul-buerkner commented 7 months ago

I agree, but not sure if this is worth the effort. Let's see if I have time for this at some point.

venpopov commented 7 months ago

Definitely not a high priority. Perhaps for v3.0.0 :)

I was thinking about this because we are building a package for cognitive modeling that uses brms as an interface to construct the Stan code, and for our models this type of specification is more intuitive. For now we have written a custom formula class, bmmformula, which uses this specification, and we translate it internally to a brms formula. This allows us to specify models like this, where the only terms that appear in the formula are model parameters:

library(bmm)
data <- OberauerLin_2017

formula <- bmf(c ~ 0 + set_size,
               a ~ 0 + set_size,
               s ~ 0 + set_size,
               kappa ~ 1)

model <- IMMfull(resp_err = "dev_rad",
                 nt_features = paste0("col_nt", 1:7),
                 nt_distances = paste0("dist_nt",1:7),
                 setsize = "set_size")

fit <- fit_model(formula, data, model)

venpopov commented 4 months ago

closing this to consolidate all related issues in #1660