propose new `ConstrainedUniform` distribution

drbenvincent commented 2 years ago

Following the discussion #5066 (with input from @aseyboldt), I propose a new ConstrainedUniform distribution. An ideal use case for this is as a distribution for cutpoints in an ordinal regression.

The idea is that the distribution would be a vector of N>2 uniformly distributed values where the first entry is constrained to take on a given min value and the final entry is constrained to take on a given max value. The first value is hard constrained to take on a particular value, and the final value is also constrained to take on a particular value - because of the use of the Dirichlet (which sums to 1). Use of the cumsum was also found to be necessary to enforce ordering and avoid divergences in sampling.

A ConstrainedUniform(N) would have N-2 degrees of freedom.

Minimum working example

The new distribution would be along these lines

def constrainedUniform(N, min=0, max=1):
    return pm.Deterministic('theta',
                             at.concatenate([
                                 np.ones(1)*min,
                                 at.extra_ops.cumsum(pm.Dirichlet("theta_unknown", a=np.ones(N-2))) (min+(max-min))
                             ])
                           )

If you had the following observations

y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3,
       3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6])

K = len(np.unique(y))
print(f"K = {K} categories")

Then you could have a pleasingly concise ordinal regression model

with pm.Model() as model:
    theta = constrainedUniform(K)
    mu = pm.Normal('mu', mu=K/2, sigma=K)
    sigma = pm.HalfNormal("sigma", 1)
    pm.OrderedProbit("y_obs", cutpoints=theta, eta=mu, sigma=sigma, observed=y)

Note that this amounts to a very simple 'intercept only' model, where mu is this intercept. There are no predictor variables in this example

Implementation issues

When converting that function constrainedUniform into a PyMC distribution, then at the very least, you'd have to consider:

we get some divergences when moving away from min=0 and max=1 it seems
receive coords
I've not gone through this, but @lucianopaz points out there might be something to learn from this blog post https://betanalpha.github.io/assets/case_studies/ordinal_regression.html#22_Surgical_Cut.
Possibly relevant to the above is the numpyro.SimplexToOrderedTransform
think about usage in a hierarchical ordinal regression context... shape?

tomicapretto commented 2 years ago

Interesting post about zero-sum constrain here https://discourse.mc-stan.org/t/test-soft-vs-hard-sum-to-zero-constrain-choosing-the-right-prior-for-soft-constrain/3884/27

ricardoV94 commented 2 years ago

Sounds like a good candidate for pymc-experimental

ricardoV94 commented 9 months ago

This should be a simple helper that looks like a PyMC distribution (but is not one) with new and dist methods

pymc-devs / pymc-experimental

propose new `ConstrainedUniform` distribution #32

Minimum working example

Implementation issues