lnccbrown / HSSM

Development of HSSM package

Error message for defining regression-based models & specification of priors for intercepts #404

Closed gingjehli closed 4 months ago

gingjehli commented 5 months ago

I've tried to use the formula below to define a regression-based model: "formula": "v ~ (1 + zSTN + zGPe)*group + (1 + zSTN + zGPe | group/participant_id)",

This produces an error message saying: AttributeError: 'Intercept' object has no attribute 'components'

I respecified the formula as follows: "formula": "v ~ 1 + group + (zSTN + zGPe)*group + (1 + zSTN + zGPe | group/participant_id)",

This second formula worked. It might be good to add that somewhere in the tutorials.

Additional question: In Stan, the priors for an intercept apply after all predictors have been centered (i.e., the predictors are automatically centered unless users deliberately specify otherwise). Is that also the case in HSSM? I'm asking because in many cases the value of y (the dependent variable) when covariate x equals 0 is not meaningful; it's often easier to think about the value of y when x equals x_mean. Placing a prior on the intercept after centering the predictors therefore often makes it easier to specify a reasonable prior. But I'm now wondering: if a user forgets to center the covariates, a prior on the intercept that assumes centered predictors might mess things up (i.e., push the actual values out of bounds).
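To make the concern concrete, here is a minimal sketch in plain Python (no HSSM or Bambi involved) of how the meaning of the intercept changes with centering. The data are made-up illustrative numbers and `ols` is a hypothetical helper for a single-covariate least-squares fit; nothing here reflects how HSSM builds its models.

```python
from statistics import mean

def ols(x, y):
    """Least-squares fit of y = intercept + slope * x for one covariate."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up data: the covariate lives far away from 0.
x = [48.0, 50.0, 52.0, 54.0, 56.0]
y = [1.1, 1.3, 1.4, 1.8, 1.9]

x_centered = [xi - mean(x) for xi in x]

b0_raw, b1_raw = ols(x, y)
b0_centered, b1_centered = ols(x_centered, y)

# The slope is unchanged by centering; only the intercept moves.
print("raw:     ", b0_raw, b1_raw)
print("centered:", b0_centered, b1_centered)
```

With centered x, the intercept equals the mean of y, so a prior placed around plausible values of y is reasonable. With raw x, the intercept is the extrapolated value at x = 0 and can land far outside that range, which is the mismatch described above.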

AlexanderFengler commented 5 months ago

Re first point: yes, we should have a few more examples that specify more complicated regressions. We use the formulae package for design matrix construction via Bambi, and even from their docs it's not easy to figure out exactly how the first model differs from the second in terms of the constructed matrix (or why matrix construction fails in the first case).
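As a rough aid for comparing the fixed-effect parts of the two formulas, the sketch below expands them by hand using the textbook Wilkinson rule a*b = a + b + a:b. This is an assumption about the intended algebra, not a description of what formulae does internally, and it says nothing about why the first formula raises the error. The helpers `term`, `cross`, and `show` are hypothetical names made up for this illustration.

```python
def term(*names):
    """A model term is a set of variable names; the empty set is the intercept."""
    return frozenset(names)

INTERCEPT = term()

def cross(left, right):
    """Textbook expansion of a*b into a + b + a:b over sets of terms."""
    interactions = {a | b for a in left for b in right}
    return set(left) | set(right) | interactions

def show(terms):
    """Render terms as sorted labels, writing the intercept as '1'."""
    return sorted(":".join(sorted(t)) or "1" for t in terms)

group = {term("group")}

# (1 + zSTN + zGPe)*group
first = cross({INTERCEPT, term("zSTN"), term("zGPe")}, group)

# 1 + group + (zSTN + zGPe)*group
second = {INTERCEPT} | group | cross({term("zSTN"), term("zGPe")}, group)

print(show(first))
print(show(second))
print(first == second)
```

Under this textbook expansion both formulas yield the same set of terms, which suggests the second form is a rewrite of the first rather than a different model, although the design matrix that formulae actually builds would need to be inspected to confirm that.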

Re standardization: as far as I know (I also just checked again), Bambi does not standardize the predictors by default; this is up to the user. We could add this as convenience functionality on the HSSM side.

According to the documentation (https://bambinos.github.io/formulae/notebooks/getting_started.html#User-guide), you can use the scale() function inside your regression formula to standardize your variables at the point of design matrix construction.
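For illustration, the snippet below builds a formula string in the style of the working formula from this thread, with each covariate wrapped in scale(). The column names `STN` and `GPe` are hypothetical stand-ins for raw, unstandardized covariates, and the snippet only assembles the string. Whether scale() is accepted in every position shown, in particular inside the group-specific term, has not been checked against formulae, Bambi, or HSSM here.

```python
# Hypothetical raw (unstandardized) covariate columns; names are illustrative only.
covariates = ["STN", "GPe"]

# Wrap each covariate in scale() so it is standardized when the design matrix is built.
scaled = " + ".join(f"scale({c})" for c in covariates)

# Same structure as the second (working) formula from this thread.
v_formula = (
    f"v ~ 1 + group + ({scaled})*group "
    f"+ (1 + {scaled} | group/participant_id)"
)

# Dictionary entry in the same shape as the "formula" entry quoted above.
v_spec = {"formula": v_formula}
print(v_spec["formula"])
```

If the covariates are already z-scored before being passed in, as the z prefix in zSTN and zGPe may suggest, the scale() wrapper would be redundant.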