pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/
Other
8.49k stars 1.97k forks source link

BUG: Generation of random draws from Weibull returns constant values #7220

Closed tomicapretto closed 2 months ago

tomicapretto commented 3 months ago

Describe the issue:

The generation of random draws from a Weibull distribution is returning the same values when the input is constant along some axis. This does not happen with other distributions such as the Gamma. See the example below.

Reproduceable code example:

import numpy as np
import scipy.special as sp
import pymc as pm

rng = np.random.default_rng(123)

# Simulate mean from an only-intercept model. 2 chains, 100 draws, 5 observations.
# So 'mu' is the same for all the observations (because it's intercept-only)
mu_draws = np.abs(150 + np.dstack([rng.normal(size=(2, 100, 1))] * 5))

# Simulate some alpha values
alpha_draws = np.abs(rng.normal(size=(2, 100, 1)))

# With 'mu' and 'alpha' get 'beta', which is what pm.Weibull needs
beta_draws = mu_draws / sp.gamma(1 + 1 / alpha_draws)

# See the draws, for a given chain and draw, they look all the same!
weibull_draws = pm.draw(pm.Weibull.dist(alpha=alpha_draws, beta=beta_draws))
weibull_draws

print((weibull_draws == weibull_draws[:, :, 0][..., None]).all())
# True --> they're in fact all the same

You can see this is not the case for the gamma distribution

gamma_draws = pm.draw(pm.Gamma.dist(alpha=alpha_draws, beta=beta_draws))
(gamma_draws == gamma_draws[:, :, 0][..., None]).all() # False

Error message:

No response

PyMC version information:

5.11.0

Context for the issue:

Someone reported this in Bambi after they saw a warning with az.plot_ppc() https://github.com/bambinos/bambi/discussions/788

tomicapretto commented 3 months ago

I think this is where the problem occurs:

https://github.com/pymc-devs/pymc/blob/89f6fcf751774fb50016561dc448a87fba7ed3aa/pymc/distributions/continuous.py#L2491-L2493

See below

import numpy as np

rng = np.random.default_rng(1234)

alpha = np.abs(rng.normal(size=(100, 20, 1)))
beta = np.abs(np.dstack([rng.normal(size=(100, 20, 1))] * 5))

alpha_draws = rng.weibull(alpha, size=None)
alpha_draws.shape # (100, 20, 1)

np.asarray(beta * alpha_draws).shape # (100, 20, 5)

But it's reusing the same that's sampled from rng.weibull along the last axis, and since beta is constant along that axis, the sampled values are repeated.

ricardoV94 commented 3 months ago

Probably needs a

if size is None:
  size = np.broadcast_shapes(alpha.shape, beta.shape)

(Untested code)