pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/

BUG: passing the 'max_n_steps' parameter as kwarg to HurdleNegativeBinomial distribution does not work #7307

Closed realityjunky closed 1 month ago

realityjunky commented 1 month ago

Describe the issue:

Passing max_n_steps as a keyword argument to pm.HurdleNegativeBinomial, a mixture distribution with a Truncated component, raises a TypeError. The traceback suggests the argument is forwarded to Mixture rather than to the Truncated component.

Reproducible code example:

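import pymc as pm
import pytensor.tensor as pt  # as_tensor_variable and stack live in pytensor.tensor

# Raises the TypeError shown under "Error message"
with pm.Model():
    ad_nb = pm.HurdleNegativeBinomial('ad_nb', psi=.1, n=4000, p=1 - 5.8 * 1e-5, max_n_steps=10000)
    prior = pm.sample_prior_predictive(samples=100)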

# while the following hand-made mixture works
nonzero_p = .1
with pm.Model():
    nonzero_p = pt.as_tensor_variable(nonzero_p)
    weights = pt.stack([1 - nonzero_p, nonzero_p], axis=-1)
    comp_dists = [
        pm.DiracDelta.dist(0),
        pm.Truncated.dist(pm.NegativeBinomial.dist(p=1-6*1e-5, n=4000), lower=1, max_n_steps=10000),
    ]
    pm.Mixture('ads', weights, comp_dists)
    prior = pm.sample_prior_predictive(samples=100)

Error message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[8], line 4
      1 with pm.Model():
      2     # ad_nb = pm.NegativeBinomial('ad_nb', n=[4000, 3000], 
      3     #                             p=[1 - 5.8 * 1e-5, 1 - 6.8 * 1e-5])
----> 4     ad_nb = pm.HurdleNegativeBinomial('ad_nb', psi=1 - 0.7, n=[4000, 3000], 
      5                                 p=[1 - 5.8 * 1e-5, 1 - 6.8 * 1e-5], max_n_steps=10000)
      6     prior = pm.sample_prior_predictive(samples=100)

File /opt/conda/envs/pm5/lib/python3.11/site-packages/pymc/distributions/mixture.py:926, in HurdleNegativeBinomial.__new__(cls, name, psi, mu, alpha, p, n, **kwargs)
    925 def __new__(cls, name, psi, mu=None, alpha=None, p=None, n=None, **kwargs):
--> 926     return _hurdle_mixture(
    927         name=name,
    928         nonzero_p=psi,
    929         nonzero_dist=NegativeBinomial.dist(mu=mu, alpha=alpha, p=p, n=n),
    930         dtype="int",
    931         **kwargs,
    932     )

File /opt/conda/envs/pm5/lib/python3.11/site-packages/pymc/distributions/mixture.py:836, in _hurdle_mixture(name, nonzero_p, nonzero_dist, dtype, **kwargs)
    830 comp_dists = [
    831     DiracDelta.dist(zero),
    832     Truncated.dist(nonzero_dist, lower=lower),
    833 ]
    835 if name is not None:
--> 836     return Mixture(name, weights, comp_dists, **kwargs)
    837 else:
    838     return Mixture.dist(weights, comp_dists, **kwargs)

File /opt/conda/envs/pm5/lib/python3.11/site-packages/pymc/distributions/distribution.py:411, in Distribution.__new__(cls, name, rng, dims, initval, observed, total_size, transform, *args, **kwargs)
    408     elif observed is not None:
    409         kwargs["shape"] = tuple(observed.shape)
--> 411 rv_out = cls.dist(*args, **kwargs)
    413 rv_out = model.register_rv(
    414     rv_out,
    415     name,
   (...)
    420     initval=initval,
    421 )
    423 # add in pretty-printing support

File /opt/conda/envs/pm5/lib/python3.11/site-packages/pymc/distributions/mixture.py:221, in Mixture.dist(cls, w, comp_dists, **kwargs)
    216     raise ValueError(
    217         f"Mixture components must all have the same support dimensionality, got {components_ndim_supp}"
    218     )
    220 w = pt.as_tensor_variable(w)
--> 221 return super().dist([w, *comp_dists], **kwargs)

File /opt/conda/envs/pm5/lib/python3.11/site-packages/pymc/distributions/distribution.py:488, in Distribution.dist(cls, dist_params, shape, **kwargs)
    486 ndim_supp = getattr(cls.rv_op, "ndim_supp", None)
    487 if ndim_supp is None:
--> 488     ndim_supp = cls.rv_op(*dist_params, **kwargs).owner.op.ndim_supp
    489 create_size = find_size(shape=shape, size=size, ndim_supp=ndim_supp)
    490 rv_out = cls.rv_op(*dist_params, size=create_size, **kwargs)

TypeError: Mixture.rv_op() got an unexpected keyword argument 'max_n_steps'

PyMC version information:

pymc==5.11.0
pytensor==2.18.6
macOS Sonoma 14.4.1
Installation: conda

Context for the issue:

I'm not sure whether this is specific to HurdleNegativeBinomial and max_n_steps, or a deeper problem affecting Hurdle models in general. It could be a hurdle for people who rely on Hurdle models, which to me are a key feature of PyMC compared to other tools.

welcome[bot] commented 1 month ago

:tada: Welcome to PyMC! :tada: We're really excited to have your input into the project! :sparkling_heart:
If you haven't done so already, please make sure you check out our Contributing Guidelines and Code of Conduct.

ricardoV94 commented 1 month ago

This should be an easy issue if someone wants to take it
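
For anyone picking this up, here is a rough sketch of the kwarg routing involved, written with plain-Python stand-ins rather than the real PyMC classes. The names truncated_dist, mixture, hurdle_mixture_current and hurdle_mixture_patched are hypothetical and only mimic the call shapes visible in the traceback. The patched variant is one possible approach, not necessarily how the fix should look in mixture.py.

```python
# Plain-Python stand-ins; none of these are the real PyMC objects.

def truncated_dist(dist, lower, max_n_steps=None):
    """Stand-in for Truncated.dist, which accepts max_n_steps."""
    return {"kind": "truncated", "dist": dist, "lower": lower, "max_n_steps": max_n_steps}


def mixture(name, weights, comp_dists, **kwargs):
    """Stand-in for Mixture, which rejects max_n_steps as in the traceback."""
    if "max_n_steps" in kwargs:
        raise TypeError("mixture() got an unexpected keyword argument 'max_n_steps'")
    return {"name": name, "weights": weights, "comp_dists": comp_dists, **kwargs}


def hurdle_mixture_current(name, nonzero_p, nonzero_dist, **kwargs):
    """Mirrors the traceback: every kwarg is forwarded to the mixture."""
    comp_dists = ["dirac_delta(0)", truncated_dist(nonzero_dist, lower=1)]
    return mixture(name, [1 - nonzero_p, nonzero_p], comp_dists, **kwargs)


def hurdle_mixture_patched(name, nonzero_p, nonzero_dist, max_n_steps=None, **kwargs):
    """Hypothetical fix: capture max_n_steps and hand it to the truncated component."""
    # Forward max_n_steps only when the caller supplied it, so the
    # truncated component keeps its own default otherwise.
    trunc_kwargs = {} if max_n_steps is None else {"max_n_steps": max_n_steps}
    comp_dists = ["dirac_delta(0)", truncated_dist(nonzero_dist, lower=1, **trunc_kwargs)]
    return mixture(name, [1 - nonzero_p, nonzero_p], comp_dists, **kwargs)


# The current routing fails in the same way as the report.
try:
    hurdle_mixture_current("ad_nb", 0.1, "negative_binomial", max_n_steps=10000)
except TypeError as err:
    print(err)  # mixture() got an unexpected keyword argument 'max_n_steps'

# The patched routing delivers the value to the truncated component.
patched = hurdle_mixture_patched("ad_nb", 0.1, "negative_binomial", max_n_steps=10000)
print(patched["comp_dists"][1]["max_n_steps"])  # 10000
```

Making max_n_steps an explicit keyword parameter keeps it out of the **kwargs that reach the mixture, while other keyword arguments still pass through unchanged.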