bambinos / bambi

BAyesian Model-Building Interface (Bambi) in Python.
https://bambinos.github.io/bambi/
MIT License
1.06k stars 122 forks source link

Add priors to custom functions parameters #448

Open Acruve15 opened 2 years ago

Acruve15 commented 2 years ago

Hi bambinos,

First thanks for this great package, really useful and the backend part is awesome.

I would love to add priors to custom functions but it is impossible at the moment.

Let's imagine I have this function:

def my_func(serie, K, S, eps=10e-8): 
    return 1 / (1 + (K / (serie + eps)) ** S)

when K and S are known, this is trivial:

K, S = 0.5, 2

model_1 = bmb.Model("y ~ my_func(x1, K, S) + x2", data)

fitted_1 = model_1.fit(tune=1000, draws=2000, init="adapt_diag", random_seed=123)

When using priors on K and S, I would intuitively try:

K = Prior("Beta", alpha=3, beta=3)
S = Prior("Gamma", alpha=1.5, beta=0.5)

model_2 = bmb.Model("y ~ my_func(x1, K, S) + x2", data)

But this gives me the following error:

TypeError: Cannot broadcast np.ndarray with operand of type <class 'bambi.priors.prior.Prior'>

Do you have any workarounds to make this work? If not, do you think it could be an interesting improvement to the library? I have the feeling it would ask to re-think Term's properties.

Many thanks in advance for your answer :)


To generate the data:

size = 200
a= 1
b= 2
c= 4

rng = np.random.default_rng(123)
x = pd.DataFrame({
    'intercept': [1]*size,
    'x1': np.linspace(0, 1, size),
    'x2': np.linspace(0, 2, size),
})

x['x3'] = my_func(x['x1'].values, K=0.5, S=2)

# y = a + b*x2 + c*x3
true_regression_line = a+ b* x['x2'] + c* x['x3']
# add noise
y = true_regression_line + rng.normal(scale=0.5, size=size)
y.name = "y"

data = pd.concat([y, x], axis=1)
tomicapretto commented 2 years ago

Hi @Acruve15!

I had this in mind for a while, but still didn't have time or couldn't figure it out.

Let me add more context.

Bambi works with Generalized Linear Mixed Models (GLMMs). The linear part refers to linear in the parameters. Many things that look non-linear are actually linear in some parameter space, and consequently, they are linear models. Splines are one example of that.

The model you use is not linear in the parameters when either K or S is unknown (i.e. are considered parameters and not constants).

Then, there are a couple of things we need to think about very well: