pymc-devs / pymc4

Experimental PyMC interface for TensorFlow Probability. Official work on this project has been discontinued.
Apache License 2.0

Factories for distributions that can be reparametrized #240

Open luke14free opened 4 years ago

luke14free commented 4 years ago

It would be very nice to implement a pattern where we can have factories for distributions that can be reparametrized in terms of loc/scale. A good example would be the Beta, where I'd love to be able to write something like:

pm.Beta.from_loc_scale(name='beta', loc=..., scale=...)

This could become very handy in a number of cases (I was thinking about regressions, but there are many more).

tirthasheshpatel commented 4 years ago

This can be done easily and is very handy!

Quick fix straight from Wikipedia:

```python
# Meant to live inside the Beta class; assumes `import tensorflow as tf`.
@staticmethod
def from_loc_scale(name, loc, scale, **kwargs):
    """
    Beta distribution from `loc` and `scale` parameters.

    Parameters
    ----------
    loc : tensor, float
        Mean of the distribution
    scale : tensor, float
        Variance of the distribution
    """
    # nu = alpha + beta; it must be strictly positive for a valid Beta
    nu = (loc * (1 - loc)) / scale - 1
    if tf.reduce_any(nu <= 0):
        raise ValueError("invalid value for `loc` or `scale`")
    alpha = loc * nu
    beta = (1 - loc) * nu
    return Beta(name=name, concentration0=alpha, concentration1=beta, **kwargs)
```
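For anyone who wants to check the arithmetic without TensorFlow, here is a minimal pure-Python sketch of the same mean/variance conversion, plus a round trip through the standard Beta moment formulas. The function names `beta_from_mean_variance` and `beta_mean_variance` are made up for illustration and are not part of PyMC4.

```python
import math


def beta_from_mean_variance(mean, variance):
    """Return (alpha, beta) for a Beta distribution with this mean and variance."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    # nu = alpha + beta; positive only when variance < mean * (1 - mean)
    nu = mean * (1.0 - mean) / variance - 1.0
    if nu <= 0.0:
        raise ValueError("variance must be smaller than mean * (1 - mean)")
    return mean * nu, (1.0 - mean) * nu


def beta_mean_variance(alpha, beta):
    """Mean and variance of Beta(alpha, beta), used to check the round trip."""
    total = alpha + beta
    return alpha / total, alpha * beta / (total ** 2 * (total + 1.0))


alpha, beta = beta_from_mean_variance(0.3, 0.02)
mean, variance = beta_mean_variance(alpha, beta)
assert math.isclose(mean, 0.3) and math.isclose(variance, 0.02)
```

The round trip recovers the requested mean and variance, which is a cheap guard against transcribing the formulas incorrectly.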

I wonder if this can be done for multivariate distributions though (or if it even makes sense to do so)? What do you say, @luke14free?

luke14free commented 4 years ago

Yes, static methods for the win here. I am not sure how useful it would be to have this on multivariate distributions (there might be use cases, but none come to mind immediately). My point was to avoid having to recompute simple transformations every time (in the past I managed to introduce a couple of stupid bugs by transcribing the wrong transformations from paper to code).

Maybe it would make sense to have it for multivariate distributions like the Dirichlet and Multinomial, while the most used ones, like the multivariate Gaussian and Student's t, are already expressed in terms of mean/scale.
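As a rough illustration of what such a factory could compute for the Dirichlet, one common reparametrization uses a mean vector and a total concentration, with `alpha_i = mean_i * concentration`. The sketch below is plain Python, and the function name `dirichlet_from_mean_concentration` is hypothetical rather than an existing PyMC4 API.

```python
import math


def dirichlet_from_mean_concentration(mean, concentration):
    """Return Dirichlet alphas with the given mean vector and total concentration."""
    if concentration <= 0:
        raise ValueError("concentration must be positive")
    if any(m <= 0 for m in mean) or not math.isclose(sum(mean), 1.0):
        raise ValueError("mean must be positive and sum to 1")
    # The Dirichlet mean is alpha_i / sum(alpha), so scaling the mean by the
    # total concentration yields alphas with exactly that mean.
    return [m * concentration for m in mean]


alphas = dirichlet_from_mean_concentration([0.2, 0.3, 0.5], 10.0)
total = sum(alphas)
assert all(math.isclose(a / total, m) for a, m in zip(alphas, [0.2, 0.3, 0.5]))
```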

lucianopaz commented 4 years ago

We plan to allow a single parametrization in each distribution's initialization function. PyMC3 supported multiple parametrizations in `__init__` (e.g. the Normal), and that made things harder to maintain. That being said, @luke14free, your idea of having a static factory method do this automatically is a perfectly valid approach. We just need to agree on the design here. I think the simplest way to do this would be to implement these static methods in each distribution class that needs them, but that would lead to essentially duplicate code in many places and would be harder to maintain. Maybe there could be some base classes that implement common reparametrizations (e.g. a normal's scale and precision) and have the appropriate classes inherit from these. I would like to hear what the others think. @twiecki, @junpenglao?
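One way the shared-base-class idea could look is sketched below, under the assumption that every inheriting class takes `loc` and `scale` in its constructor. `PrecisionFactoryMixin`, `from_precision`, and the toy `Normal` and `StudentT` classes are hypothetical stand-ins, not PyMC4 code.

```python
import math


class PrecisionFactoryMixin:
    """Hypothetical mixin: adds a precision-based factory to loc/scale classes."""

    @classmethod
    def from_precision(cls, name, loc, precision, **kwargs):
        if precision <= 0:
            raise ValueError("precision must be positive")
        # precision = 1 / scale**2, so scale = 1 / sqrt(precision)
        return cls(name, loc=loc, scale=1.0 / math.sqrt(precision), **kwargs)


class Normal(PrecisionFactoryMixin):
    """Toy distribution with a single canonical (loc, scale) parametrization."""

    def __init__(self, name, loc, scale):
        self.name, self.loc, self.scale = name, loc, scale


class StudentT(PrecisionFactoryMixin):
    """Second toy class reusing the same factory; extra kwargs pass through."""

    def __init__(self, name, loc, scale, df):
        self.name, self.loc, self.scale, self.df = name, loc, scale, df


x = Normal.from_precision("x", loc=0.0, precision=4.0)
t = StudentT.from_precision("t", loc=0.0, precision=0.25, df=3)
assert x.scale == 0.5 and t.scale == 2.0
```

Using a `classmethod` rather than a `staticmethod` lets the factory build whichever subclass it is called on, so the conversion is written once.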

twiecki commented 4 years ago

Yeah, I like the static method approach. In PyMC3 we just supported multiple kwargs, which didn't work terribly either; usually there are no more than 2 parameterizations. Any reason not to do that?
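For comparison, the multiple-kwargs style could be sketched roughly as follows. This is an illustrative toy and not PyMC3's actual implementation: the helper `resolve_scale` and this `Normal` class are assumptions made for the example.

```python
def resolve_scale(sigma=None, tau=None):
    """Hypothetical helper: accept either sigma (std dev) or tau (precision)."""
    if sigma is not None and tau is not None:
        raise ValueError("pass either sigma or tau, not both")
    if sigma is None and tau is None:
        raise ValueError("pass one of sigma or tau")
    value = sigma if sigma is not None else tau
    if value <= 0:
        raise ValueError("sigma and tau must be positive")
    # tau = 1 / sigma**2, so sigma = tau ** -0.5
    return float(sigma) if sigma is not None else tau ** -0.5


class Normal:
    """Toy distribution whose __init__ accepts two parametrizations."""

    def __init__(self, name, mu, sigma=None, tau=None):
        self.name = name
        self.mu = mu
        self.sigma = resolve_scale(sigma=sigma, tau=tau)


assert Normal("x", 0.0, tau=4.0).sigma == 0.5
assert Normal("y", 0.0, sigma=2.0).sigma == 2.0
```

The trade-off is visible even at this size: the constructor stays convenient for users, but each extra parametrization adds validation branches to `__init__`, which is the maintenance cost mentioned earlier in the thread.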