jmschrei / pomegranate

Fast, flexible and easy to use probabilistic modelling in Python.
http://pomegranate.readthedocs.org/en/latest/
MIT License

Bayesian Network Prediction, tuple index out of range #1109

Open Curvedrain opened 3 months ago

Curvedrain commented 3 months ago

I know this may be difficult to answer without much information, but I wanted to ask whether this issue has been seen before, since I could not find any past reports of it.

When I run predict on a Bayesian network model with a masked tensor of the correct size as input, I get the error "tuple index out of range".

This occurs at line 386 of factor_graph.py, in the function predict_proba: shape[l+1] = message.shape[l+1]
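For context, this error text is what Python raises whenever a shape tuple is indexed past its number of dimensions. The snippet below is only a generic illustration of that mechanism, not pomegranate's code; the two-dimensional `message_shape` and the value of `l` are hypothetical values chosen to trigger the same message.

```python
# Generic illustration only: the names and values here are hypothetical,
# not taken from factor_graph.py.
message_shape = (1, 2)  # shape of a 2-D array: valid indices are 0 and 1
l = 1

error_text = None
try:
    # l + 1 == 2 is past the last valid index of a 2-element tuple
    message_shape[l + 1]
except IndexError as exc:
    error_text = str(exc)

print(error_text)  # tuple index out of range
```

So the failing line suggests that `message` has fewer dimensions than the loop index expects at that point.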

I've been having trouble debugging this and any help would be greatly appreciated.

jmschrei commented 2 months ago

Sorry you're encountering issues. Without more information, it's hard for me to diagnose the issue. Have you looked at the documentation and unit tests to make sure you're defining the network correctly?

peter-hoxtonfarms commented 2 months ago

Hi, I have experienced the same issue and have produced an MWE that reproduces it.

I am using the following installation:

%watermark -m -n -p torch,pomegranate
torch      : 2.3.1
pomegranate: 1.1.0

Compiler    : Clang 14.0.0 (clang-1400.0.29.202)
OS          : Darwin
Release     : 23.5.0
Machine     : arm64
Processor   : arm
CPU cores   : 8
Architecture: 64bit

The following script reproduces the issue:

import numpy as np
import torch
from pomegranate.bayesian_network import BayesianNetwork
from pomegranate.distributions.categorical import Categorical
from pomegranate.distributions.conditional_categorical import ConditionalCategorical

"""
We model a system like the following:

x -> z <- y

where we want to predict the probability of x and y given evidence about z.

The CPD for z is as follows:
  |       y = 0       |       y = 1
z |  x = 0  |  x = 1  |  x = 0  |  x = 1
0 |  0.999  |  0.2008 |  0.3007 |  0.44056
1 |  0.001  |  0.7992 |  0.6993 |  0.55944

For x and y, p(0) = 0.999 and p(1) = 0.001.
"""

x = Categorical([[0.999, 0.001]])
y = Categorical([[0.999, 0.001]])

cpd = np.zeros((2, 2, 2))
cpd[0, 0, :] = [0.999, 0.001]
cpd[0, 1, :] = [0.3007, 0.6993]
cpd[1, 0, :] = [0.2008, 0.7992]
cpd[1, 1, :] = [0.44056, 0.55944]

z = ConditionalCategorical(cpd, n_categories=[2, 2])

model = BayesianNetwork([x, y, z], [[x, z], [y, z]])

X = torch.tensor([[-1, -1, 1]])
X_masked = torch.masked.MaskedTensor(X, mask=X >= 0)
model.predict_proba(X_masked)

Running this script gives me the following error:

OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
/Users/peter/code/hoxton-farms/sandbox/pomegranate_venv/lib/python3.9/site-packages/torch/masked/maskedtensor/core.py:156: UserWarning: The PyTorch API of MaskedTensors is in prototype stage and will change in the near future. Please open a Github issue for features requests and see our documentation on the torch.masked module for further information about the project.
  warnings.warn(("The PyTorch API of MaskedTensors is in prototype stage "
Traceback (most recent call last):
  File "/Users/peter/code/hoxton-farms/research/experimental/peter/pomegranate_mwe.py", line 39, in <module>
    model.predict_proba(X_masked)
  File "/Users/peter/code/hoxton-farms/sandbox/pomegranate_venv/lib/python3.9/site-packages/pomegranate/bayesian_network.py", line 441, in predict_proba
    return self._factor_graph.predict_proba(X)
  File "/Users/peter/code/hoxton-farms/sandbox/pomegranate_venv/lib/python3.9/site-packages/pomegranate/factor_graph.py", line 386, in predict_proba
    shape[l+1] = message.shape[l+1]
IndexError: tuple index out of range

Does this look like an error in the input or a bug in the sum-product implementation?
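As a sanity check on what I would expect the posteriors to be, here is a minimal NumPy sketch that computes P(x | z = 1) and P(y | z = 1) by brute-force enumeration. It does not use pomegranate at all, and it assumes the CPD array is indexed as cpd[x, y, z], as in the script above; the variable names are my own.

```python
import numpy as np

# Priors for x and y, as in the reproduction script.
p_x = np.array([0.999, 0.001])
p_y = np.array([0.999, 0.001])

# CPD for z, assumed to be indexed as cpd[x, y, z].
cpd = np.zeros((2, 2, 2))
cpd[0, 0, :] = [0.999, 0.001]
cpd[0, 1, :] = [0.3007, 0.6993]
cpd[1, 0, :] = [0.2008, 0.7992]
cpd[1, 1, :] = [0.44056, 0.55944]

# Full joint P(x, y, z) = P(x) * P(y) * P(z | x, y), shape (2, 2, 2).
joint = p_x[:, None, None] * p_y[None, :, None] * cpd

# Condition on the evidence z = 1 and renormalise to get P(x, y | z = 1).
posterior_xy = joint[:, :, 1]
posterior_xy = posterior_xy / posterior_xy.sum()

# Marginalise out the other parent to get each posterior.
post_x = posterior_xy.sum(axis=1)  # P(x | z = 1)
post_y = posterior_xy.sum(axis=0)  # P(y | z = 1)

print("P(x | z=1):", post_x)
print("P(y | z=1):", post_y)
```

With only two binary parents the enumeration is tiny, so this gives a reference answer to compare against once predict_proba runs without the IndexError.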

Thanks for all the help!