Closed sambaths closed 7 months ago
Hi @sambaths! No noobs questions, thanks for asking!
Shouldn't we add the mean here?
We don't have to do that here, as the group-specific effects in Bambi resemble random-effects in non-bayesian libraries. These effects are deflections around a common effect. So, a priori, they are centered around zero (we don't need the mean).
For example, let's take a model with a hierarchical intercept (or random effects) where we have $J$ groups.
This is the centered parametrization
$$ \begin{aligned} Y_i &\sim \text{Normal}(\mu_i, \sigma) \ \mui &= \alpha{j[i]} \ \alphaj &\sim \text{Normal}(\mu\alpha, \sigma\alpha) & \text{for all } j \in 1 \dots, J \ \mu\alpha &\sim \text{Normal} \ \sigma_\alpha &\sim \text{HalfNormal} \ \sigma &\sim \text{HalfNormal} \end{aligned} $$
coords = {"group": [1, 2, ... J]}
with pm.Model(coords=coords):
alpha_sigma = pm.HalfNormal("alpha_sigma")
alpha_mu = pm.Normal("alpha_mu")
alpha = pm.Normal("alpha", mu=alpha_mu, sigma=alpha_sigma, dims="group")
sigma = pm.HalfNormal("sigma")
mu = alpha[group_idx]
pm.Normal("y", mu=mu, sigma=sigma, observed=y)
And this is the non-centered one
$$ \begin{aligned} Y_i &\sim \text{Normal}(\mu_i, \sigma) \ \mui &= \alpha{j[i]} \ \alphaj &= \mu\alpha + \alphaj^* \sigma\alpha \ \alphaj^* &\sim \text{Normal}(0, 1) & \text{for all } j \in 1 \dots, J \ \mu\alpha &\sim \text{Normal} \ \sigma_\alpha &\sim \text{HalfNormal} \ \sigma &\sim \text{HalfNormal} \end{aligned} $$
coords = {"group": [1, 2, ... J]}
with pm.Model(coords=coords):
alpha_sigma = pm.HalfNormal("alpha_sigma")
alpha_mu = pm.Normal("alpha_mu")
alpha_star = pm.Normal("alpha_star", dims="group")
alpha = alpha_mu + alpha_star * alpha_sigma
sigma = pm.HalfNormal("sigma")
mu = alpha[group_idx]
pm.Normal("y", mu=mu, sigma=sigma, observed=y)
In Bambi, we use by default something that looks like the non-centered version. We have
$$ \alphaj = \alpha\text{baseline} + \alpha_{\text{deflection}, j} $$
The $\alpha\text{baseline}$ represents the $\mu\alpha$ above, and the $\alpha_{\text{deflection}, j}$ represents $\alphaj^* \sigma\alpha$. The latter is centered around zero, just like the group-specific effects in Bambi.
As for the second part of the question, about whether this only happens with Normal distributions, you're reading it correctly. We almost always use normal distributions for the deflections so that is what was implemented. If you use another distribution, the non-centered approach doesn't apply in Bambi (it could be modified, but i don't know how hard it would be)
Thanks @tomicapretto for that explanation. That clears the first part for me.
As for the second part of the question, about whether this only happens with Normal distributions, you're reading it correctly. We almost always use normal distributions for the deflections so that is what was implemented. If you use another distribution, the non-centered approach doesn't apply in Bambi (it could be modified, but i don't know how hard it would be)
on last part, I think I might have incorrectly worded my question above -
Suppose, For one of the parameters in my model, we have a prior that is LogNormal
, that would also get non-centered using offsets from the Normal
, right ?
For a LogNormal
, Shouldn't non-centered approach be looking something like this ? (based on the discussions here):
import theano.tensor as tt
# Centered
pm.LogNormal('val', mu=mu, sd=sd)
# NonCentered
val_raw = pm.Normal('val_raw', mu=0, sd=1)
val = tt.exp(val_raw * sd + mu)
I think so. But notice the code says "non-centered Lognormal on log space", and I emphasize the "log space" part.
I think so. But notice the code says "non-centered Lognormal on log space", and I emphasize the "log space" part.
Thanks @tomicapretto
Sorry, if this is a noob question. I'm still learning,
Shouldn't we add the mean here?
Before - https://github.com/bambinos/bambi/blob/ff685b75a0ef17a11c8399de6de0a5f9d388d231/bambi/backend/terms.py#L145-L148
After -
Also, this only happens when Prior is
Normal
(Assuming I read the codebase correctly), or is that not the case ? (I saw that this might be possible for other distributions also based on a discussion here)Again, Sorry for the noob questions.
Any and all replies is highly appreciated.