pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/

`dataset_to_point_list` fails when chain, draw are not the leading dims #7178

Closed ricardoV94 closed 7 months ago

ricardoV94 commented 7 months ago

Description

import pymc as pm

with pm.Model(coords={"trial": [0]}) as m:
    x = pm.Normal("x", shape=(1,), dims="trial")
    y = pm.Normal("y", x, observed=[5], dims="trial")

    idata = pm.sample(tune=0, draws=10)
    # Reorder the posterior so that chain, draw are no longer the leading dims
    posterior = idata.posterior.transpose("chain", "trial", "draw")
    pm.sample_posterior_predictive(posterior)  # IndexError: index 2 is out of bounds for axis 0 with size 2

CC @OriolAbril

OriolAbril commented 7 months ago

I am actually surprised it breaks there and not in the reshaping part. The culprit is clearly this line:

https://github.com/pymc-devs/pymc/blob/6252d2e58dc211c913ee2e652a4058d271d48bbd/pymc/util.py#L253

We transpose the DataArray so that everything works independently of the dimension order, but the reshape then reads the shape of the final array from the not-yet-transposed DataArray.
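To make the mismatch concrete, here is a minimal NumPy-only sketch of the pattern described above. It is not the actual code in `pymc/util.py`: the variable names and the `num_sample_dims` value are assumptions chosen to mirror the example in the description (2 chains, 1 trial, 10 draws).

```python
import numpy as np

# Assumed sizes, mirroring the reproducer: 2 chains, 1 trial, 10 draws.
n_chain, n_trial, n_draw = 2, 1, 10
num_sample_dims = 2  # chain and draw

# Values stored as (chain, trial, draw): the sample dims are not leading.
original = np.arange(n_chain * n_trial * n_draw).reshape(n_chain, n_trial, n_draw)

# Step 1: move the sample dims to the front -> (chain, draw, trial).
transposed = original.transpose(0, 2, 1)

# Buggy pattern: the trailing shape is read from the untransposed array,
# so the "trial" axis is mistaken for the "draw" axis.
buggy = transposed.reshape((-1, *original.shape[num_sample_dims:]))

# Consistent pattern: the trailing shape is read from the transposed array.
fixed = transposed.reshape((-1, *transposed.shape[num_sample_dims:]))

print(buggy.shape)  # (2, 10): only 2 "points" instead of 20
print(fixed.shape)  # (20, 1): one row per (chain, draw) sample
```

With the buggy shape, iterating over the expected 20 samples fails at index 2 because axis 0 only has size 2, which is consistent with the `IndexError` reported above.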