bambinos / bambi

BAyesian Model-Building Interface (Bambi) in Python.
https://bambinos.github.io/bambi/
MIT License

Bambi slow on one model but not the other even though performance should be identical #567

Open canyon289 opened 2 years ago

canyon289 commented 2 years ago

For some reason Bambi is slow on the second model but not on the first. I expected the run times to be similar, so the large discrepancy is quite confusing.

Full notebook attached

# First model: default intercept included (samples quickly)
model_age_videogames = bmb.Model("converted['Yes'] ~ age + likes_videogames", df, link="logit", family="bernoulli")
idata_age_videogames = model_age_videogames.fit(draws=2000, target_accept=0.85, random_seed=SEED)

# Second model: same formula with "+ 0", which drops the intercept (samples slowly)
model_age_videogames = bmb.Model("converted['Yes'] ~ age + likes_videogames + 0", df, link="logit", family="bernoulli")
idata_age_videogames = model_age_videogames.fit(draws=2000, target_accept=0.85, random_seed=SEED)

https://gist.github.com/canyon289/5e6dcc218788adcb4712a34f10be377e
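To put numbers on the discrepancy, both fits can be timed the same way. This is a minimal sketch using only the standard library; `time_call`, `fake_fit_fast`, and `fake_fit_slow` are hypothetical names, and the two stand-in functions take the place of the real `model.fit(...)` calls so the snippet runs without the notebook's data.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Hypothetical stand-ins for the two fit calls; in the notebook these
# would be something like `lambda: model.fit(draws=2000, ...)`.
def fake_fit_fast():
    return sum(range(1_000))


def fake_fit_slow():
    time.sleep(0.05)  # simulate a slower sampler
    return sum(range(1_000))


fast_result, fast_seconds = time_call(fake_fit_fast)
slow_result, slow_seconds = time_call(fake_fit_slow)

print(f"fast: {fast_seconds:.4f}s  slow: {slow_seconds:.4f}s")
```

Wrapping each fit in the same helper keeps the comparison fair, since both measurements include exactly the same overhead.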

tomicapretto commented 2 years ago

Thanks for the detailed gist :)

kelleyjbrady commented 1 year ago

I am also facing this issue. Running the exact same model back to back gives the same outcome you observed: the first model converges, while the second does not and takes forever to sample, only to give different results for all of the values in arviz.summary(trace) once it has finished sampling. You don't show the result of arviz.summary(trace) in your example, but I bet the R-hat values on the second model are >> 1, indicating serious sampling issues. I feel like pymc must be doing some sort of unintended warm start when Bambi is used. I do not see any issues when I use pymc alone.
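For readers unfamiliar with the diagnostic mentioned above, here is a rough sketch of why chains that disagree push R-hat far above 1. It implements the classic (non-split) Gelman-Rubin statistic in plain Python, which is simpler than the rank-normalized version ArviZ reports, so the numbers will not match `arviz.summary` exactly. The function name `gelman_rubin_rhat` and the toy chains are made up for illustration.

```python
from statistics import mean, variance


def gelman_rubin_rhat(chains):
    """Classic (non-split) Gelman-Rubin R-hat for a single parameter.

    chains: list of equal-length lists of draws, one list per chain.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2:
        raise ValueError("need at least 2 chains with at least 2 draws each")
    if any(len(chain) != n for chain in chains):
        raise ValueError("all chains must have the same number of draws")

    # W: average of the within-chain sample variances
    within = mean([variance(chain) for chain in chains])
    # B / n: sample variance of the per-chain means
    between_over_n = variance([mean(chain) for chain in chains])

    # Pooled estimate of the posterior variance
    var_hat = (n - 1) / n * within + between_over_n
    return (var_hat / within) ** 0.5


# Toy data: two chains exploring the same region ...
mixed_chains = [[0.1, -0.2, 0.3, 0.0], [0.0, 0.2, -0.1, 0.1]]
# ... versus two chains stuck in different regions.
stuck_chains = [[0.1, -0.2, 0.3, 0.0], [5.0, 5.2, 4.9, 5.1]]

rhat_mixed = gelman_rubin_rhat(mixed_chains)
rhat_stuck = gelman_rubin_rhat(stuck_chains)
print(f"mixed: {rhat_mixed:.2f}  stuck: {rhat_stuck:.2f}")
```

When the chains sit in different regions, the variance between chain means dominates the within-chain variance, which is what a value much greater than 1 signals.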