erdogant / bnlearn

Python package for Causal Discovery by learning the graphical structure of Bayesian networks. Structure Learning, Parameter Learning, Inferences, Sampling methods.
https://erdogant.github.io/bnlearn

Missing data #81

Open PARODBE opened 1 year ago

PARODBE commented 1 year ago

Hi,

One question: does the library have any option for handling missing data, like bnlearn in R?

Thanks!

erdogant commented 1 year ago

There are no imputation functions for missing data. But if you create a function that does it (without a lot of dependencies of other packages), feel free to push it!

harrietmwwright commented 1 year ago

I'm also looking for this functionality. At the moment, if you make a prediction on a dataset with one of the variables removed, the prediction works; however, it raises an error if you provide that variable with a value of NaN. Is some sort of imputation/estimation being done in the backend?
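For context on the question above: in exact inference on a Bayesian network in general, a variable that is left out of the evidence is summed out (marginalized) rather than imputed. Whether that is exactly what happens in the backend here is an assumption. The sketch below does not use bnlearn at all; it hand-codes the textbook sprinkler network with illustrative CPD values, and the `query` helper is hypothetical.

```python
from itertools import product

# Hand-written CPDs for the textbook sprinkler network (illustrative values,
# not read from bnlearn). Each table stores P(variable = 1 | parents).
P_C1 = 0.5
P_S1_GIVEN_C = {0: 0.5, 1: 0.1}
P_R1_GIVEN_C = {0: 0.2, 1: 0.8}
P_W1_GIVEN_SR = {(0, 0): 0.0, (1, 0): 0.9, (0, 1): 0.9, (1, 1): 0.99}

VARS = ("Cloudy", "Sprinkler", "Rain", "Wet_Grass")


def bern(p1, value):
    """Probability of a binary value given P(value = 1)."""
    return p1 if value == 1 else 1.0 - p1


def joint(c, s, r, w):
    """Joint probability from the chain rule of the network."""
    return (bern(P_C1, c)
            * bern(P_S1_GIVEN_C[c], s)
            * bern(P_R1_GIVEN_C[c], r)
            * bern(P_W1_GIVEN_SR[(s, r)], w))


def query(target, evidence):
    """P(target | evidence) by enumeration (hypothetical helper).

    Variables that are absent from `evidence` are summed out.
    """
    scores = {0: 0.0, 1: 0.0}
    for values in product((0, 1), repeat=len(VARS)):
        row = dict(zip(VARS, values))
        if any(row[name] != val for name, val in evidence.items()):
            continue  # inconsistent with the observed evidence
        scores[row[target]] += joint(*values)
    total = sum(scores.values())
    return {state: score / total for state, score in scores.items()}


# All parents of Wet_Grass observed.
full = query("Wet_Grass", {"Cloudy": 1, "Sprinkler": 0, "Rain": 1})
# Rain left out of the evidence: it is marginalized, not imputed.
partial = query("Wet_Grass", {"Cloudy": 1, "Sprinkler": 0})
print(full[1], partial[1])
```

A NaN, by contrast, is not a valid state of the variable, which would be one plausible reason for the error when NaN is passed as evidence.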

PARODBE commented 1 year ago

Could we use Bayes' theorem and, with the computed posteriors, remove the missing data from the equation?

erdogant commented 1 year ago

Can you maybe make a small example to demonstrate this? Maybe with the sprinkler data set?

PARODBE commented 1 year ago

I know that the pymc3 library does this. I read about it in an article on hierarchical linear regression using a Bayesian approach. I don't remember the article at the moment, but this blog post shows something similar: http://stronginference.com/missing-data-imputation.html

erdogant commented 1 month ago

Imputation functionality for missing values has now been implemented. See the docs over here.

Update to the latest version with:

pip install -U bnlearn

PARODBE commented 1 month ago

You could include the MICE approach but with the same Bayesian model, similar to MICE with random forests: iteratively, you use the values that are not missing to compute the missing ones, but with a Bayesian model as the predictor. What do you think?

erdogant commented 1 month ago

This one? https://scikit-learn.org/1.5/modules/generated/sklearn.impute.IterativeImputer.html#sklearn.impute.IterativeImputer

PARODBE commented 1 month ago

More or less; it's an adaptation of the original. But the problem is that, if I'm not wrong, it only supports quantitative data. There are also other options that support categorical data, like this one: https://github.com/AnotherSamWilson/miceforest

But I'm not sure whether you can plug in a Bayesian model. I think that once you have built your TAN, FAN, or whatever Bayesian model with your library, you could iteratively fit these models on the data without missing values and use them to predict the missing values.
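A rough sketch of that idea, for categorical data, could look like the following. It is not bnlearn's implementation and uses no bnlearn calls: a hand-written naive Bayes classifier (the simplest Bayesian network classifier, which TAN extends) stands in for the learned model, and the names `fit_naive_bayes`, `predict`, and `chained_impute` are hypothetical. It also simplifies MICE: it fills each gap with the most probable value instead of drawing from the predictive distribution, and it produces a single imputed dataset. It assumes every column has at least one observed value.

```python
import math
from collections import Counter, defaultdict


def fit_naive_bayes(rows, target, features, alpha=1.0):
    """Count-based categorical naive Bayes with Laplace smoothing."""
    class_counts = Counter(r[target] for r in rows)
    cond = defaultdict(Counter)   # (feature, class) -> counts of feature values
    levels = defaultdict(set)     # feature -> set of observed values
    for r in rows:
        for f in features:
            cond[(f, r[target])][r[f]] += 1
            levels[f].add(r[f])
    return class_counts, cond, levels, alpha


def predict(model, row, features):
    """Return the most probable class for `row` under the fitted model."""
    class_counts, cond, levels, alpha = model
    n = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / n)
        for f in features:
            k = len(levels[f])
            score += math.log((cond[(f, c)][row[f]] + alpha) / (n_c + alpha * k))
        if score > best_score:
            best, best_score = c, score
    return best


def chained_impute(rows, columns, n_iter=5):
    """Chained-equations style imputation; missing values are marked as None."""
    missing = [{c for c in columns if r[c] is None} for r in rows]
    filled = [dict(r) for r in rows]  # copy so the input stays untouched
    # Step 1: initialise every gap with the column mode.
    for c in columns:
        mode = Counter(r[c] for r in rows if r[c] is not None).most_common(1)[0][0]
        for r in filled:
            if r[c] is None:
                r[c] = mode
    # Step 2: repeatedly refit one model per column and re-predict its gaps.
    for _ in range(n_iter):
        for c in columns:
            features = [f for f in columns if f != c]
            train = [r for r, m in zip(filled, missing) if c not in m]
            model = fit_naive_bayes(train, c, features)
            for r, m in zip(filled, missing):
                if c in m:
                    r[c] = predict(model, r, features)
    return filled


# Toy data in which Rain and Wet agree in every fully observed row.
rows = ([{"Rain": "yes", "Wet": "yes"} for _ in range(4)]
        + [{"Rain": "no", "Wet": "no"} for _ in range(4)]
        + [{"Rain": "yes", "Wet": None}, {"Rain": None, "Wet": "no"}])
filled = chained_impute(rows, ["Rain", "Wet"])
print(filled[8], filled[9])
```

Swapping the naive Bayes stand-in for a model learned with the library would keep the same loop: fit on rows where the column is observed, then predict the rows where it is missing.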

erdogant commented 1 month ago

The MICE functionality has been added thanks to a contribution from @Ananyapam7. See here for more information.

Update to the latest version with:

pip install -U bnlearn