Closed tomicapretto closed 4 months ago
I found a similar error message in a pretty big model I'm working with, and I think it's related to this same issue. I see the same error when I try to sample from the posterior predictive using pm.Truncated together with pm.CustomDist. See the following example:
import numpy as np
import pymc as pm
import pytensor.tensor as pt
from pymc.model.fgraph import clone_model
# simulate data
y_values = pm.draw(pm.Truncated.dist(pm.Exponential.dist(scale=[2, 4, 6]), upper=7), 200, random_seed=1234)
y_values = y_values.T.flatten()
groups = list("ABC")
groups_idx = np.repeat([0, 1, 2], 200)
assert len(y_values) == len(groups_idx)
coords = {
    "group": groups,
    "__obs__": np.arange(len(y_values))
}
# Works
with pm.Model(coords=coords) as model:
    groups_idx_data = pm.Data("groups_idx", groups_idx, dims="__obs__")
    b = pm.Normal("b", dims="group")
    scale = pm.Deterministic("scale", b[groups_idx_data], dims="__obs__")
    value_latent = pm.Exponential.dist(scale=scale)
    value = pm.Truncated("value", value_latent, upper=7, observed=y_values, dims="__obs__")
    idata = pm.sample(chains=2, random_seed=1234)
# Works
new_coords = {
    "__obs__": np.arange(3) + 100,
}
new_data = {
    "groups_idx": np.array([0, 1, 2])
}
with clone_model(model) as c_model:
    pm.set_data(new_data, coords=new_coords)
    predictions = pm.sample_posterior_predictive(
        idata,
        var_names=["value"],
        predictions=True,
        random_seed=1234,
    )
# Fails
new_coords = {
    "__obs__": np.arange(3) + 100,
}
new_data = {
    "groups_idx": np.array([0, 1, 2])
}
def f_exp(scale, size):
    return pm.Exponential.dist(scale=scale, size=size)
with clone_model(model) as c_model:
    pm.set_data(new_data, coords=new_coords)
    b = c_model["b"]
    groups_idx_data = c_model["groups_idx"]
    scale = pm.Deterministic("b_new", b[groups_idx_data], dims="__obs__")
    value_latent = pm.CustomDist.dist(scale, dist=f_exp)
    pm.Truncated("value_new", value_latent, upper=7, dims="__obs__")
    predictions = pm.sample_posterior_predictive(
        idata,
        var_names=["value_new"],
        predictions=True,
        random_seed=1234,
    )
ValueError: All variables needed to compute inner-graph must be provided as inputs under strict=True. The inner-graph implicitly depends on the following shared variables [RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7415037912A0>), group, groups_idx]
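As a side note on the simulated data at the top of the example above: the transpose-then-flatten step is what keeps y_values aligned with groups_idx. The sketch below checks that layout with NumPy only. It assumes pm.draw returns an array of shape (draws, groups); the small sizes and the names n_draws, n_groups, draws and flat are illustrative and not part of the original report.

```python
import numpy as np

# Stand-in for the (draws, groups) array assumed to come from pm.draw:
# 4 draws of a 3-group distribution. Each value encodes its own group
# (0, 1 or 2) so the layout is easy to follow.
n_draws, n_groups = 4, 3
draws = np.tile(np.arange(n_groups), (n_draws, 1))  # shape (4, 3)

# Transposing puts groups on the first axis; flattening then lists all of
# group 0's draws, then group 1's, then group 2's.
flat = draws.T.flatten()

# np.repeat builds the matching group index for each flattened value.
groups_idx = np.repeat(np.arange(n_groups), n_draws)

# Every flattened value sits at a position labelled with its own group.
assert np.array_equal(flat, groups_idx)
```

Flattening without the transpose would interleave the groups instead, and the index built with np.repeat would no longer match.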
Is the problem gone with the bugfix?
Nope, I installed from main and I still had the problem.
Is the clone_model stuff needed to reproduce the problem?
Nope, the following also fails with the same message
# Fails
new_coords = {
    "__obs__": np.arange(3) + 100,
}
new_data = {
    "groups_idx": np.array([0, 1, 2])
}
def f_exp(scale, size):
    return pm.Exponential.dist(scale=scale, size=size)
with model:
    pm.set_data(new_data, coords=new_coords)
    b = model["b"]
    groups_idx_data = model["groups_idx"]
    scale = pm.Deterministic("b_new", b[groups_idx_data], dims="__obs__")
    value_latent = pm.CustomDist.dist(scale, dist=f_exp)
    pm.Truncated("value_new", value_latent, upper=7, dims="__obs__")
    predictions = pm.sample_posterior_predictive(
        idata,
        var_names=["value_new"],
        predictions=True,
        random_seed=1234,
    )
If you can get an even smaller example without the set_data / multiple models that's even better :)
@ricardoV94 I'll try :)
Here is a MWE:
import pymc as pm
def f_exp(scale, size):
    return pm.Exponential.dist(scale=scale, size=size)
with pm.Model() as model:
    b = pm.Normal("b", shape=(3,))
    value_latent = pm.CustomDist.dist(b[[0, 0, 1, 1, 2, 2]], dist=f_exp)
    pm.Truncated("value_new", value_latent, upper=7)
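For readers less familiar with what the upper=7 truncation in these examples means, here is a rough NumPy-only sketch of drawing from an exponential restricted to [0, upper] by inverting its CDF. This is only an illustration of the distribution; it is not necessarily how pm.Truncated draws samples internally, and the helper name sample_truncated_exponential is hypothetical.

```python
import numpy as np

def sample_truncated_exponential(scale, upper, size, rng):
    """Draw from Exponential(scale) restricted to [0, upper] (hypothetical helper)."""
    u = rng.uniform(size=size)
    # Probability mass of the untruncated exponential below `upper`.
    mass = 1.0 - np.exp(-upper / scale)
    # Invert the truncated CDF F(x) = (1 - exp(-x / scale)) / mass.
    return -scale * np.log1p(-u * mass)

rng = np.random.default_rng(1234)
samples = sample_truncated_exponential(scale=4.0, upper=7.0, size=1000, rng=rng)

# All draws fall inside the truncation bounds.
assert samples.min() >= 0.0
assert samples.max() <= 7.0
```

The scale and upper values mirror one of the groups in the simulated data; any positive scale works the same way.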
Describe the issue:
The same model results in an error depending on the usage of a deterministic. See the examples below.
Reproducible code example:
Error message:
PyMC version information:
PyMC = 5.14.0 PyTensor = 2.20.0
Context for the issue:
I'm trying to finalize a large refactor in Bambi and some tests about HurdlePoisson didn't pass and that's how I found this. The same thing happened with HurdleNegativeBinomial.