jmschrei / pomegranate

Fast, flexible and easy to use probabilistic modelling in Python.
http://pomegranate.readthedocs.org/en/latest/
MIT License

Bayesian Networks training from samples with incomplete information. #820

Closed ingjpal closed 4 years ago

ingjpal commented 4 years ago

Hello Jacob,

One of the added values of Bayesian statistics is that conditional probability tables can come from different databases, so the evidence does not have to be present for every single instance. I was wondering whether the Bayesian networks in pomegranate will accept an incomplete data set for the from_samples training method. Something like this, for example:

[['A', 'B', None], ['A', None, 'C'], [None, 'B', 'A'], ['A', 'B', 'C'], ['A', 'B', None], ['A', 'B', 'A']]

Cheers,

Juan

jmschrei commented 4 years ago

Yes, you can learn Bayesian networks on incomplete data sets. See https://github.com/jmschrei/pomegranate/blob/master/tutorials/C_Feature_Tutorial_4_Missing_Values.ipynb
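
As a rough illustration of the idea in the question (conditional probability tables estimated from rows where not every variable is observed), here is a minimal plain-Python sketch. It is not pomegranate's implementation and does not use its API; the helper `estimate_cpt` and the choice of columns 0 and 1 as parents of column 2 are hypothetical and chosen only for the example. Each table is estimated from the rows in which the child and all of its parents are observed, so a row with a missing value in one column can still contribute to tables that do not involve that column.

```python
from collections import Counter, defaultdict

# Toy data in the spirit of the question: each row is one sample over three
# variables (columns 0, 1, 2); None marks an unobserved value.
samples = [
    ['A', 'B', None],
    ['A', None, 'C'],
    [None, 'B', 'A'],
    ['A', 'B', 'C'],
    ['A', 'B', None],
    ['A', 'B', 'A'],
]


def estimate_cpt(rows, child, parents):
    """Estimate P(child | parents) by counting, skipping any row in which
    the child or one of its parents is missing (hypothetical helper)."""
    counts = defaultdict(Counter)
    for row in rows:
        involved = [row[i] for i in parents] + [row[child]]
        if any(value is None for value in involved):
            continue  # this row carries no usable evidence for this table
        parent_values = tuple(row[i] for i in parents)
        counts[parent_values][row[child]] += 1

    # Normalize the counts for each parent configuration into probabilities.
    cpt = {}
    for parent_values, counter in counts.items():
        total = sum(counter.values())
        cpt[parent_values] = {v: n / total for v, n in counter.items()}
    return cpt


# Assumed structure for illustration only: columns 0 and 1 are parents of 2.
cpt = estimate_cpt(samples, child=2, parents=[0, 1])
print(cpt)

# A variable without parents uses every row in which it is observed.
marginal = estimate_cpt(samples, child=0, parents=[])
print(marginal)
```

Only two rows have all three columns observed, so the table for column 2 rests on those two, while the marginal for column 0 uses the five rows where that column is present. For how pomegranate itself represents and handles missing values, the linked tutorial is the reference.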