kearnz / autoimpute

Python package for Imputation Methods
MIT License

Bug Using Bayesian Linear Regression and Bayesian Logistic Regression in MiceImputer #60

Closed kearnz closed 3 years ago

kearnz commented 3 years ago

There's a bug in how the MiceImputer implements Bayesian methods. Both the Bayesian binary and Bayesian least squares strategies suffer from this issue. Essentially, pymc3, the underlying package we use to build Bayesian models, does not allow you to redefine an existing deterministic variable, and autoimpute tries to do exactly that when it iterates through imputations. I will work on a fix for this bug this weekend while I'm tackling another issue.

Here's where that pops up in autoimpute. Both the MultipleImputer and the MiceImputer create n SingleImputer instances under the hood. In the MultipleImputer, each of those n instances iterates k=1 time, so if you use a Bayesian method, the model variables are created once per instance. Perfectly valid. In the MiceImputer, however, each of the n SingleImputer instances iterates k=5 times (by default). Each instance therefore tries to create the same Bayesian variables k times, and that throws an error.
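
To make the n-by-k interaction concrete, here is a minimal sketch in plain Python. It does not use pymc3 or autoimpute: `ToyModel`, `run_imputer`, and `simulate` are hypothetical names, and the error text is only an approximation of what pymc3 reports. The only assumption is a model that registers variables by name and rejects duplicates. The `reuse_model=False` branch shows one possible way around the clash (a fresh model per iteration); it is not necessarily how autoimpute resolved it.

```python
class ToyModel:
    """Hypothetical stand-in for a model that registers variables by name."""

    def __init__(self):
        self.named_vars = {}

    def add_variable(self, name, value):
        # Reject a second variable with the same name, as described above.
        if name in self.named_vars:
            raise ValueError(f"Variable name {name} already exists.")
        self.named_vars[name] = value


def run_imputer(k, reuse_model):
    """Simulate one SingleImputer-like instance that fits k times."""
    model = ToyModel()
    for iteration in range(k):
        if not reuse_model:
            model = ToyModel()  # fresh name registry for every fit
        model.add_variable("mu", iteration)


def simulate(n, k, reuse_model=True):
    """Run n instances with k iterations each; return 'ok' or the error."""
    try:
        for _ in range(n):
            run_imputer(k, reuse_model)
    except ValueError as err:
        return f"error: {err}"
    return "ok"


# MultipleImputer-like case: k=1, each variable is defined once per instance.
multiple_result = simulate(n=5, k=1)
# MiceImputer-like case: k=5, the second iteration redefines "mu".
mice_result = simulate(n=5, k=5)
# Same n and k, but with a fresh model on every iteration.
mice_fresh_result = simulate(n=5, k=5, reuse_model=False)
```

With a shared model, only the k=1 case completes; the k=5 case fails on the second iteration of the first instance, which matches the behavior described above.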

kearnz commented 3 years ago

Fixed in 0.12.2.