pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/
Other
8.71k stars 2.01k forks source link

BUG: Coordinates are not passed correctly to HSGP components #7561

Open juanitorduz opened 5 hours ago

juanitorduz commented 5 hours ago

Describe the issue:

In the example https://www.pymc.io/projects/examples/en/latest/gaussian_processes/GP-Births.html we see in the model diagram that f_trend is a deterministic variable coming from a HSGP-component with coordinate time

    f_trend = gp_trend.prior(
        name="f_trend", X=normalized_time_data[:, None], dims="time"
    )

image

So far so good :)

However: for pymc > 5.16 (I think), this is no longer the case and the coordinate does not longer propagate and it gets its own coordinate f_trend_dim_0. Note this variable is outside the time plate:

image

The model still works but it would be nice to keep the previous version to have consistent coordinates :)

Remark: Even If I change the model to have a non-existing dimension

    f_trend = gp_trend.prior(
        name="f_trend",
        X=normalized_time_data[:, None],
        dims="blabla",
    )

The result is the same.

Reproducible code example:

https://www.pymc.io/projects/examples/en/latest/gaussian_processes/GP-Births.html

Error message:

No response

PyMC version information:

Works fine with

Last updated: Fri Mar 29 2024

Python implementation: CPython
Python version       : 3.11.7
IPython version      : 8.20.0

numpyro : 0.14.0
pytensor: 2.19.0

pandas    : 2.1.4
preliz    : 0.4.1
matplotlib: 3.8.2
pytensor  : 2.19.0
pymc      : 5.12.0
seaborn   : 0.13.2
numpy     : 1.26.3
xarray    : 2024.2.0
arviz     : 0.17.1

Watermark: 2.4.3

the issue commes (at least) with

Last updated: Tue Nov 05 2024

Python implementation: CPython
Python version       : 3.12.7
IPython version      : 8.27.0

numpyro : 0.15.3
pytensor: 2.23.0

xarray    : 2023.6.0
preliz    : 0.11.0
pandas    : 2.2.2
numpy     : 1.26.4
pymc      : 5.16.1
seaborn   : 0.13.2
arviz     : 0.17.1
pytensor  : 2.23.0
matplotlib: 3.9.2
sklearn   : 1.5.1

Watermark: 2.5.0

Context for the issue:

No response

juanitorduz commented 5 hours ago

FYI @aloctavodia ;)

juanitorduz commented 4 hours ago

ah! @AlexAndorra @bwengals I think I found the issue. The dims in HSGP.prior are gp_dims but in HSGPPeriodic.prioraredimsas expected. Shall we simply usedims`? The inconsistency is weird and otherwise a lot of existing code would have this issue.

https://github.com/pymc-devs/pymc/blob/33be4dc3c901d32906fae98837e9bcf502bc7027/pymc/gp/hsgp_approx.py#L427-L435

juanitorduz commented 4 hours ago

If we agree on gp_dims - > dims I can create a PR

bwengals commented 4 hours ago

Definitely agree on gp_dims -> dims. Should the order be swapped on dims and hsgp_coeffs_dims? I think dims is more likely to be used.