pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/
Other
8.66k stars 2k forks source link

BUG: Using a length-only coord causes it to be volatile #7376

Closed JasonTam closed 2 months ago

JasonTam commented 3 months ago

Describe the issue:

When a variable is defined with a coord, it's considered volatile. Then sample_posterior_predictive does not transfer the node over. I believe the volatility condition that's met is https://github.com/pymc-devs/pymc/blob/9eaf92ab40ff95cf7f5cb2763377e06bb0d7be0f/pymc/sampling/forward.py#L208-L212

The coord is intended to be immutable, but after https://github.com/pymc-devs/pymc/pull/7047, coords always mutable

Somewhat related:

Reproduceable code example:

import pymc as pm

y_obs = [0] * 3
with pm.Model() as m:
    m.add_coord("coord0", length=1)
    x = pm.Data("x", [0] * 3)
    b = pm.Normal("b", dims="coord0")  # Causes `b` to be volatile
    # b = pm.Normal("b", shape=1)      # Works as expected (`b` is transferred)
    y = pm.Normal("y", b[0] * x, observed=y_obs)
    idata = pm.sample()

    pm.sample_posterior_predictive(trace=idata, var_names='y')

Error message:

Sampling: [b, y]

PyMC version information:

5.15.1

Context for the issue:

Cannot use sample_posterior_predictive as intended

welcome[bot] commented 3 months ago

Welcome Banner] :tada: Welcome to PyMC! :tada: We're really excited to have your input into the project! :sparkling_heart:
If you haven't done so already, please make sure you check out our Contributing Guidelines and Code of Conduct.

ricardoV94 commented 3 months ago

The volatility logic is supposed to check if the values of mutable data actually changed (it's not enough to be mutable) so this is quite surprising

JasonTam commented 3 months ago

Ah I think I understand what the issue is. The condition check for constant_coords is: https://github.com/pymc-devs/pymc/blob/9eaf92ab40ff95cf7f5cb2763377e06bb0d7be0f/pymc/sampling/forward.py#L797-L803

But if you add a coord without any values (just the length)

m.add_coord("coord0", length=1)

then the coord, will be None. We will only have dim_lengths information. https://github.com/pymc-devs/pymc/blob/9eaf92ab40ff95cf7f5cb2763377e06bb0d7be0f/pymc/model/core.py#L1030-L1031

I guess my question now: is this the intended?

ricardoV94 commented 3 months ago

I see, we always always get tricked by coords without values (as in we tend to forget they exist all the time). In that case the volatility check should be whether the length changed or not?

What do you think? And do you want to open a PR to fix it?

JasonTam commented 3 months ago

For sure it threw me in for a loop. So I think it should either be considered constant if the length stays the same, or a log message should be fired.

I've opened a PR https://github.com/pymc-devs/pymc/pull/7381 and we can continue any discussion there