bambinos / bambi

BAyesian Model-Building Interface (Bambi) in Python.
https://bambinos.github.io/bambi/
MIT License

bambi.interpret.prediction fails for polynomial regression models #772

Closed tjburch closed 9 months ago

tjburch commented 10 months ago

When doing polynomial regression, the interpret.predictions function can't seem to handle polynomial terms within the formula. An MWE:

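import bambi as bmb
import numpy as np
import pandas as pd

# Simulate a response that is quadratic in x1 and linear in x2
np.random.seed(0)
x1 = np.random.normal(size=100)
x2 = np.random.normal(size=100)
y = 2 + 3*x1 + 1.5*x1**2 + 2*x2 + np.random.normal(scale=1, size=100)
data = pd.DataFrame({"x1": x1, "x2": x2, "y": y})
model = bmb.Model("y ~ poly(x1, 2) + x2", data)
results = model.fit()
# This call raises the AttributeError shown below
bmb.interpret.predictions(model, results, "x2")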

Gives the following error: AttributeError: 'LazyOperator' object has no attribute 'name'

The same thing also happens if you do y ~ I(x**2) + x as the formula.

tomicapretto commented 10 months ago

@tjburch thanks for reporting the issue. Just want to mention the error I get is AttributeError: 'LazyValue' object has no attribute 'name'.

This is related to formulae. For every argument passed in a function call, formulae creates a representation of that object (i.e. an instance of some Lazy* class). When the argument passed is a variable name (i.e. it's an instance of LazyVariable), we can access the name of that variable with the name attribute. Other objects, such as values, don't have a name attribute.

The problem is that Bambi is looking for the name of the argument value 2 in poly(x1, 2).
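To make the failure mode concrete, here is a minimal sketch that uses hypothetical stand-in classes rather than formulae's real ones. Only the class names LazyVariable and LazyValue come from this thread; their constructors and attributes below are assumptions for illustration. It shows how reading name from every argument of poly(x1, 2) breaks on the literal 2.

```python
# Hypothetical stand-ins, NOT formulae's actual classes.
class LazyVariable:
    """An argument that refers to a data column, so it has a `name`."""

    def __init__(self, name):
        self.name = name


class LazyValue:
    """A literal argument such as 2; it deliberately has no `name`."""

    def __init__(self, value):
        self.value = value


# Assumed representation of the arguments in poly(x1, 2)
args = [LazyVariable("x1"), LazyValue(2)]

message = None
try:
    # Reading `.name` from every argument fails on the literal
    names = [arg.name for arg in args]
except AttributeError as err:
    message = str(err)
    print(message)
```

The error only appears once the loop reaches the second argument, which is why formulas whose calls take nothing but variable names are unaffected.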

@GStechschulte I see you assigned this to yourself. The solution I found is to use the following

covariates.append([arg.name for arg in component.call.args if isinstance(arg, LazyVariable)])

in here

https://github.com/bambinos/bambi/blob/1a3cf8ad49add076c7daea6c29e7b09a267f2cb7/bambi/interpret/utils.py#L239-L240

where LazyVariable is imported with from formulae.terms.call_resolver import LazyVariable.


I haven't tested this exhaustively and I think this solution won't work for cases such as

"y ~ fun1(x, fun2(z))"

as it will never reach the name z in the second argument. However, I'm not too worried about it for now, as I don't think there are many reasons to use Bambi that way.
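If the nested case ever needs support, one option is to walk the arguments recursively. The sketch below is only an illustration of that idea under stated assumptions: the classes are hypothetical stand-ins, and in particular the LazyCall class with an args attribute is an assumption about how a nested call might be represented, not a confirmed part of formulae's API. The helper collect_variable_names is likewise made up for this example.

```python
# Hypothetical stand-ins, NOT formulae's actual classes.
class LazyVariable:
    def __init__(self, name):
        self.name = name


class LazyValue:
    def __init__(self, value):
        self.value = value


class LazyCall:
    """Assumed representation of a nested call such as fun2(z)."""

    def __init__(self, callee, args):
        self.callee = callee
        self.args = args


def collect_variable_names(args):
    """Return variable names found in `args`, descending into nested calls."""
    names = []
    for arg in args:
        if isinstance(arg, LazyVariable):
            names.append(arg.name)
        elif isinstance(arg, LazyCall):
            names.extend(collect_variable_names(arg.args))
        # Literals (LazyValue) carry no variable name and are skipped
    return names


# Assumed representation of the arguments in fun1(x, fun2(z))
outer_args = [LazyVariable("x"), LazyCall("fun2", [LazyVariable("z")])]

# The flat filter from the proposed fix only sees the top-level variable
flat = [arg.name for arg in outer_args if isinstance(arg, LazyVariable)]

# The recursive version also reaches z inside the nested call
nested = collect_variable_names(outer_args)
```

The flat filter is simpler and covers poly(x1, 2); the recursive variant trades a little complexity for coverage of nested calls.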

GStechschulte commented 10 months ago

@tomicapretto thanks for the comment. I also get the same error as you. I was working on this over the weekend and came to roughly the same solution as you.

Once I add tests, I will open a PR 👍🏼

tjburch commented 10 months ago

Perfect, thanks!

For context, I've been doing some polynomial regression in brms lately, so I thought I'd check out how it works in Bambi with formulae. I started putting together some examples, so hopefully when this is resolved I'll have a shiny new notebook to contribute.

GStechschulte commented 10 months ago

Sounds great, @tjburch. Looking forward to it!