bambinos / bambi

BAyesian Model-Building Interface (Bambi) in Python.
https://bambinos.github.io/bambi/
MIT License
1.08k stars 124 forks source link

Internal refactor #804

Closed tomicapretto closed 4 months ago

tomicapretto commented 7 months ago

This started as a refactor with the goal to change how we name the parameters of the response distribution. So far, we named the parent parameter with f"{response_name}_mean" and all the other parameters with f"{response_name}_{param_name}". I no longer think appending _mean is a sensible approach as the parent parameter can be something different than the mean. And also, I realized prepending the name of the response resulted in very long and confusing names. For that reason, I decided it's better to just use the name of the parameter (i.e. mu for the mean in many families, p for probability of success, kappa, sigma, and so on).

But that is not the only change. Many other changes come with this PR. I summarize them below, and they will be added to the changelog. After this is merged, we can have a 0.14.0 release.

Summary of changes

Other important notes

EDIT Bayeux based inferences don't include the observed_data group. This is problematic if we want to do ppchecks. We need to add them. Done

EDIT: It also closes #814

tomicapretto commented 7 months ago

This is temporarily suspended because the var_names parameter in pm.sample() is not working as expected https://github.com/pymc-devs/pymc/issues/7258, and we need it for the modifications here.

tomicapretto commented 6 months ago

Now there's a new blocker which is https://github.com/pymc-devs/pymc/issues/7312

And I would also like to have https://github.com/pymc-devs/pymc/pull/7290 merged. Currently, the progressbar has many updates which slows down the sampler in jupyter notebooks.

tomicapretto commented 5 months ago

Everything should be fixed now. The pyproject.toml file is pointing to the dev version of PyMC. This should be good to go once PyMC releases a new version.

Now I'm going to re-run examples.

review-notebook-app[bot] commented 5 months ago

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

aloctavodia commented 5 months ago

I like the changes, but I find include_params and kind="params" ambiguous. Could we use something like linear_predictor, linear_term or something like that?

tomicapretto commented 5 months ago

include_params

Thanks for the review! I'm not sure I understand what is the suggestion (i.e. if you want argument(s) called linear_predictor, linear_term or those should be argument values). I also see the current names are not the best as it's as "params" can be many things in a model.

aloctavodia commented 5 months ago

I am saying that the name "params" is very vague.

tomicapretto commented 5 months ago

@GStechschulte thanks for the review! Do you have any ideas for the issue @aloctavodia mentioned? I agree the current approach is a bit vague. But I don't have lots of ideas right now :D

GStechschulte commented 5 months ago

@tomicapretto @aloctavodia apologies for the delayed response. This notification slipped through my GitHub 😞

Before, include_mean would compute the "posterior of the mean response" which technically may not always be a mean as you stated above. This computation is $g^{-1}(\eta)$ which relates the linear predictor to the response. Since it is a response parameter, but not always the mean, maybe include_response_params. It's a longer name, but less vague?

To be consistent in model.predict: