jmschrei / pomegranate

Fast, flexible and easy to use probabilistic modelling in Python.
http://pomegranate.readthedocs.org/en/latest/
MIT License
3.38k stars 590 forks

[BUG] Incorrect Prediction Output in Bayesian Network #1115

Closed omarMahamud closed 3 months ago

omarMahamud commented 3 months ago

I'm experiencing an issue with the pomegranate library while working with a Bayesian Network. Although the Conditional Probability Table (CPT) is set up correctly, the predicted distribution for one node does not match it.

Expected Behavior:

When predicting the humidity given that the weather is sunny, the model should output normal as the most likely condition based on the following CPT:
    sunny → high: 0.4
    sunny → normal: 0.6

Actual Behavior:

The model instead outputs high as the most likely condition, with the probabilities:
    high: 0.5999999999999999
    normal: 0.4000000000000001
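As a sanity check that does not depend on pomegranate, the expected distribution can be worked out by hand. The following is a minimal sketch in plain Python, using the weather prior and humidity CPT from the reproduction snippet in this report. The function names are illustrative and not part of any library.

```python
# Probabilities copied from the reproduction snippet in this report.
p_weather = {"sunny": 0.7, "rainy": 0.3}
p_humidity_given_weather = {
    "sunny": {"high": 0.4, "normal": 0.6},
    "rainy": {"high": 0.8, "normal": 0.2},
}


def posterior_humidity(observed_weather):
    """Return P(humidity | weather = observed_weather) by enumerating the joint."""
    joint = {
        h: p_weather[observed_weather] * p
        for h, p in p_humidity_given_weather[observed_weather].items()
    }
    total = sum(joint.values())
    return {h: v / total for h, v in joint.items()}


def marginal_humidity():
    """Return P(humidity) with the weather summed out (no evidence)."""
    result = {}
    for w, pw in p_weather.items():
        for h, p in p_humidity_given_weather[w].items():
            result[h] = result.get(h, 0.0) + pw * p
    return result


if __name__ == "__main__":
    sunny = posterior_humidity("sunny")
    print("P(humidity | sunny):", sunny)
    print("P(humidity | rainy):", posterior_humidity("rainy"))
    print("P(humidity), no evidence:", marginal_humidity())
    print("Most likely humidity when sunny:", max(sunny, key=sunny.get))
```

With these numbers, the posterior given sunny is roughly 0.4 / 0.6, the posterior given rainy is roughly 0.8 / 0.2, and the marginal is roughly 0.52 / 0.48 (high / normal). None of them puts 0.6 on high. The reported output matches the sunny row only with the two labels exchanged, which is an observation about the numbers and does not by itself identify the cause.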

This issue occurs even though the CPT clearly indicates that normal should have a higher probability when the weather is sunny.

To Reproduce

Here’s a minimal code snippet that reproduces the issue:

    from pomegranate import BayesianNetwork, DiscreteDistribution, ConditionalProbabilityTable, Node

    # Define the probabilities for the weather.
    weather = Node(DiscreteDistribution({"sunny": 0.7, "rainy": 0.3}), name="weather")

    temperature = Node(DiscreteDistribution({"hot": 0.2, "mild": 0.7, "cold": 0.1}), name="temperature")

    # Define the probabilities for humidity.
    humidity = Node(DiscreteDistribution({"high": 0.6, "normal": 0.4}), name="humidity")

    # Define the probabilities for weather_temperature.
    weather_temperature = ConditionalProbabilityTable([
        ["sunny", "hot", 0.3],
        ["sunny", "mild", 0.6],
        ["sunny", "cold", 0.1],
        ["rainy", "hot", 0.1],
        ["rainy", "mild", 0.4],
        ["rainy", "cold", 0.5],
    ], [weather.distribution])

    # Define the probabilities for weather_humidity.
    weather_humidity = ConditionalProbabilityTable([
        ["sunny", "high", 0.4],
        ["sunny", "normal", 0.6],
        ["rainy", "high", 0.8],
        ["rainy", "normal", 0.2],
    ], [weather.distribution])

    # Create nodes for temperature and humidity.
    temperature_given_weather = Node(weather_temperature, name="temperature_given_weather")
    humidity_given_weather = Node(weather_humidity, name="humidity_given_weather")

    # Create the Bayesian Network.
    model = BayesianNetwork("Weather Prediction Network")
    model.add_states(weather, humidity_given_weather)
    model.add_edge(weather, humidity_given_weather)
    model.bake()

    # Predict the most likely humidity when sunny.
    answer = model.predict_proba({"weather": "sunny"})[1].parameters[0]
    most_likely_humidity = max(answer, key=answer.get)
    print("Most likely humidity when sunny:", most_likely_humidity)

Environment:

I'm running this on vocerium, which provides these versions:
    Python version: 3.7
    pomegranate version: 0.14.5
    Operating System: [Your OS here, e.g., macOS,

Response time

I would appreciate any assistance with this issue. I understand that responses may be slower during weekends, and I’m happy to wait until then if needed.

jmschrei commented 3 months ago

Sorry that you're encountering issues. This looks like it was made with a version of pomegranate before v1.0.0. I am no longer supporting those versions -- sorry!