jmschrei / pomegranate

Fast, flexible and easy to use probabilistic modelling in Python.
http://pomegranate.readthedocs.org/en/latest/
MIT License

BayesianNetwork edge weights #685

arainboldt closed this issue 4 years ago

arainboldt commented 4 years ago

I'm experimenting with using the BayesianNetwork class for inferring the marginal impact of a variable on a target variable, essentially driver analysis or causal analysis.

To get the marginal impact of each variable on the target variable, I expect that I should just take the product of the weights of all edges on the path from that variable to the target, if such a path exists.

The problem I'm encountering is that I don't see any clear way to get the edge weights I'll need to evaluate a given path. What is the best way to do this?

Currently I'm doing the below, which seems a bit cumbersome:

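import numpy as np

# One evidence dict per variable: observe the i-th column as 1, leave the rest unobserved.
vals = []
for i in np.arange(10):
    vals.append({X.columns[i]: 1})
proba = bn_model.predict_proba(vals)

# For each evidence setting, estimate the mean of the target (last node) by sampling.
probs = []
for i in np.arange(len(proba)):  # iterate over proba; probs is still empty at this point
    probs.append(proba[i][-1].sample(10000).mean())
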
jmschrei commented 4 years ago

What do you mean by edge weights?

arainboldt commented 4 years ago

I mean the coefficients that express the strength of the connection between two nodes.

ghost commented 4 years ago

You may be confusing it with a Markov chain or a hidden Markov model.

jmschrei commented 4 years ago

There aren't coefficients that express the strength of the connection between two nodes in a categorical Bayesian network, which is what is implemented in pomegranate. There is no value in the edges themselves. The presence of an edge simply indicates a dependence between two variables and the lack of an edge indicates a conditional independence. For Gaussian Bayesian networks (where the values are numbers instead of categories) there would be edge weights, but not here.
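The distinction above can be sketched in plain Python without pomegranate. This is only an illustration: the networks, variable names, probabilities, and coefficients below are all made up, and the helper functions are hypothetical, not part of any library. In the categorical case the dependence of Y on X lives in a conditional probability table, so one possible "impact" measure is a difference of conditional probabilities. In the linear-Gaussian case each edge does carry a coefficient, and the effect along a single directed path is the product of those coefficients.

```python
# --- Categorical case: hypothetical network X -> Y (made-up probabilities) ---
# The edge X -> Y carries no number; the dependence is stored in Y's
# conditional probability table P(Y | X).
p_x = {0: 0.7, 1: 0.3}
p_y_given_x = {
    0: {0: 0.9, 1: 0.1},  # P(Y | X=0)
    1: {0: 0.4, 1: 0.6},  # P(Y | X=1)
}


def marginal_y(p_x, p_y_given_x):
    """P(Y=y) = sum over x of P(X=x) * P(Y=y | X=x)."""
    out = {}
    for x, px in p_x.items():
        for y, pyx in p_y_given_x[x].items():
            out[y] = out.get(y, 0.0) + px * pyx
    return out


def probability_shift(p_y_given_x, y=1, x_on=1, x_off=0):
    """P(Y=y | X=x_on) - P(Y=y | X=x_off): one possible 'impact' measure."""
    return p_y_given_x[x_on][y] - p_y_given_x[x_off][y]


p_y = marginal_y(p_x, p_y_given_x)
impact = probability_shift(p_y_given_x)

# --- Linear-Gaussian case: hypothetical chain A -> B -> C ---
# Assumed structural equations (made-up coefficients):
#   B = 2.0 * A + noise,  C = 0.5 * B + noise
weights = {("A", "B"): 2.0, ("B", "C"): 0.5}


def path_effect(path, weights):
    """Product of the edge coefficients along one directed path."""
    effect = 1.0
    for parent, child in zip(path, path[1:]):
        effect *= weights[(parent, child)]
    return effect


total_effect = path_effect(["A", "B", "C"], weights)
```

With more than one directed path between two variables in a linear model, the per-path products would be summed. In the categorical case no such single number exists, which is why comparing conditional distributions under different evidence is the closest analogue.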

arainboldt commented 4 years ago

Thanks for clarifying.