Open DrOncogene opened 1 year ago
Hi. Sorry that you're having trouble. I've included tutorials on how to write Bayesian networks here: https://pomegranate.readthedocs.io/en/latest/tutorials/B_Model_Tutorial_6_Bayesian_Networks.html
This documentation is linked in two formats (as a page in the documentation and as a link to the tutorials folder) in the first line of the README, right after a note describing that there are differences. Additionally, in the examples folder there is another example of a Bayesian network that might be helpful.
I have not written an explicit guide for how to rewrite models, but it should be fairly straightforward to convert that to the new format. The biggest differences are that you no longer need `Node` or `State` objects, the probability distributions are d-dimensional tensors instead of a list of lists, and you no longer need to `bake` at the end. These changes are written out in a section of the README: https://github.com/jmschrei/pomegranate#high-level-changes
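To make the "d-dimensional tensor instead of a list of lists" point concrete, here is a minimal sketch in plain Python that reshapes an old-style table (one row per label combination) into a nested table indexed by integer codes. The labels, probabilities, and variable names below are hypothetical and only for illustration; nothing here calls pomegranate itself.

```python
# Old style: one row per (parent value, child value, probability).
# These labels and numbers are made up for illustration.
rows = [
    ["a", "x", 0.3],
    ["a", "y", 0.7],
    ["b", "x", 0.6],
    ["b", "y", 0.4],
]

# Fix an order for each variable's labels; a label's position is its integer code.
parent_labels = ["a", "b"]
child_labels = ["x", "y"]

# New style: table[parent_code][child_code] = P(child | parent).
table = [[0.0] * len(child_labels) for _ in parent_labels]
for parent, child, prob in rows:
    table[parent_labels.index(parent)][child_labels.index(child)] = prob

print(table)
```

With two variables the result has two dimensions (one axis per variable); each additional parent adds one more level of nesting. A nested list like this can then be converted to a tensor or array before being handed to a distribution.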
Thanks for the prompt response. How can I add labels to a Categorical distribution, for example?
A design choice that I made with the latest version is to only accept integer labels in the modeling step, similar to scikit-learn and other ML repositories. You can keep lists of labels on your end and index into them at the end.
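A minimal sketch of that bookkeeping in plain Python, assuming you keep one ordered list of labels per variable. The label names and the `predicted` values are illustrative stand-ins, not output from a real model.

```python
# One ordered list of labels per variable; the position is the integer code.
labels = ["none", "light", "heavy"]
label_to_index = {label: i for i, label in enumerate(labels)}

# Encode string observations as integers before passing them to a model.
observations = ["light", "none", "heavy", "light"]
encoded = [label_to_index[value] for value in observations]

# Decode integer results back into labels afterwards.
# `predicted` is a stand-in for whatever integer codes the model returns.
predicted = [1, 0, 2]
decoded = [labels[i] for i in predicted]

print(encoded)
print(decoded)
```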
Unfortunately this isn't enough guidance for me to port the class example to pomegranate-1.0+. My lack of familiarity with the topic makes the tutorials and examples useless for converting to the new representation. I've gotten this far:
from pomegranate.bayesian_network import BayesianNetwork
from pomegranate.distributions import Categorical, JointCategorical
# Rain node has no parents
orig_rain_probs = {"none": 0.7, "light": 0.2, "heavy": 0.1}
rain_probs = [[0.7, 0.2, 0.1]]
rain = Categorical(rain_probs)
# Track maintenance node is conditional on rain
orig_maint_probs = probs = [
    ["none", "yes", 0.4],
    ["none", "no", 0.6],
    ["light", "yes", 0.2],
    ["light", "no", 0.8],
    ["heavy", "yes", 0.1],
    ["heavy", "no", 0.9],
]
maint_probs = {0: 0.4, 1: 0.6, 2: 0.2, 3: 0.8, 4: 0.1, 5: 0.9}
maintenance = JointCategorical(maint_probs, [rain.distribution])
# Train node is conditional on rain and maintenance
orig_train_probs = [
    ["none", "yes", "on time", 0.8],
    ["none", "yes", "delayed", 0.2],
    ["none", "no", "on time", 0.9],
    ["none", "no", "delayed", 0.1],
    ["light", "yes", "on time", 0.6],
    ["light", "yes", "delayed", 0.4],
    ["light", "no", "on time", 0.7],
    ["light", "no", "delayed", 0.3],
    ["heavy", "yes", "on time", 0.4],
    ["heavy", "yes", "delayed", 0.6],
    ["heavy", "no", "on time", 0.5],
    ["heavy", "no", "delayed", 0.5],
]
train_probs = {
    0: 0.8,
    1: 0.2,
    2: 0.9,
    3: 0.1,
    4: 0.6,
    5: 0.4,
    6: 0.7,
    7: 0.3,
    8: 0.4,
    9: 0.6,
    10: 0.5,
    11: 0.5,
}
train = JointCategorical(
    train_probs, [rain.distribution, maintenance.distribution]
)
# Appointment node is conditional on train
original_appointment_probs = [
    ["on time", "attend", 0.9],
    ["on time", "miss", 0.1],
    ["delayed", "attend", 0.6],
    ["delayed", "miss", 0.4],
]
appointment_probs = {0: 0.9, 1: 0.1, 2: 0.6, 3: 0.4}
appointment = JointCategorical(appointment_probs, [train.distribution])
# Create a Bayesian Network and add states
model = BayesianNetwork()
# model.add_states(rain, maintenance, train, appointment)
model.add_distributions([rain, maintenance, train, appointment])
# Add edges connecting nodes
model.add_edge(rain, maintenance)
model.add_edge(rain, train)
model.add_edge(maintenance, train)
model.add_edge(train, appointment)
But I receive this error:
Traceback (most recent call last):
  File "/home/cs50ai/src/harvard_cs50_ai/week2/bayesnet/likelihood.py", line 1, in <module>
    from model import model
  File "/home/cs50ai/src/harvard_cs50_ai/week2/bayesnet/model.py", line 19, in <module>
    maintenance = JointCategorical(maint_probs, [rain.distribution])
  File "/home/cs50ai/src/harvard_cs50_ai/week2/.venv/lib64/python3.10/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'Categorical' object has no attribute 'distribution'
I don't see an applicable distribution type for <0.7, 0.2, 0.1> because that is the distribution for that variable. I also don't see how to set it within `Categorical`.
Hi @kronenpj
Sorry you're encountering issues. I think there are a few issues with your code.
(1) `JointCategorical` doesn't accept dictionaries; it accepts a tensor with d dimensions. If you had five examples with four features each, you'd normally store them in a matrix with shape=(5,4). Here, you'd represent that data in a tensor of shape=(2,2,2,2) -- 2 for the number of possibilities in each feature, 4 dimensions for the 4 features. See the documentation: https://github.com/jmschrei/pomegranate/blob/master/pomegranate/distributions/joint_categorical.py#L34
(2) `BayesianNetwork` doesn't accept `JointCategorical`, only `Categorical` and `ConditionalCategorical`. Remember that when you're defining the network you're defining the source nodes, which are `Categorical` distributions, and the internal nodes, which are `ConditionalCategorical` ones. See https://github.com/jmschrei/pomegranate/blob/master/examples/Bayesian_Network_Monty_Hall.ipynb for how to format your `Categorical` and `ConditionalCategorical` distributions.
(3) When making these distributions you no longer need to pass the parent distributions into the child distributions directly. This is handled by the `BayesianNetwork` object. You just need to pass in the probabilities.
Let me know if you have any other questions.
Hi @jmschrei
`ConditionalCategorical` is poorly documented. Can you please explain more about how to use it in a Bayesian network? How does each entry in the input distribution of `ConditionalCategorical` correspond to the other nodes?
I was taking the CS50AI course, and the chapter on Bayesian networks uses this library. However, the course code dates back to around v0.8.1, when classes like `Node` were still in use. I struggled to find a guide, or at least the docs, for v0.14.8 or v0.8.1.
No such documentation exists, not even on the official docs website. Kindly point me to any if it exists, or help me convert the code below to the latest version. I should be able to pick it up from there.