pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/

Request: Additional parametrizations #4600

Open · ghost opened this issue 3 years ago

ghost commented 3 years ago

I frequently find myself missing one way or another to parametrize a distribution. Examples of this include (see the conversion sketch after the list):

  1. Beta distribution in terms of n = α + β and μ
  2. Inverse gamma in terms of a scale and degrees of freedom parameter (scaled inverse chi-square)
  3. Exponential in terms of its mean
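
For reference, each of these can already be expressed by converting the parameters by hand. A rough sketch, assuming the existing Beta(alpha, beta), InverseGamma(alpha, beta), and Exponential(lam) signatures; all numeric values below are placeholders:

```python
import pymc3 as pm  # `import pymc as pm` on newer versions

# Placeholder values for the alternative parameters
mu, n = 0.9, 10.0    # Beta mean and pseudo-sample-size (n = alpha + beta)
nu, tau2 = 5.0, 2.0  # scaled inverse chi-square: degrees of freedom and scale
mean = 3.0           # Exponential mean

with pm.Model():
    # 1. Beta in terms of n and mu: alpha = mu * n, beta = (1 - mu) * n
    pm.Beta("x1", alpha=mu * n, beta=(1 - mu) * n)
    # 2. Scaled inverse chi-square(nu, tau2) == InverseGamma(nu / 2, nu * tau2 / 2)
    pm.InverseGamma("x2", alpha=nu / 2, beta=nu * tau2 / 2)
    # 3. Exponential in terms of its mean: lam = 1 / mean
    pm.Exponential("x3", lam=1 / mean)
```
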
ricardoV94 commented 3 years ago

Thanks for the feedback. The last one will be added soon. For the others, would you be interested in doing a PR? Can you give more context on why these are particularly useful compared to what we have now? Perhaps some examples of where they naturally crop up.

ghost commented 3 years ago

> Thanks for the feedback. The last one will be added soon. For the others, would you be interested in doing a PR?

Unfortunately I'm pretty busy, but both should be very easy. The scaled inverse chi-square can be implemented as a scaling constant times one over a chi-squared-distributed variable, or as an inverse gamma with parameters ν/2 and ν·τ²/2 (where ν is the degrees of freedom and τ² the scale).
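
Roughly, both routes look like this with the current API (variable names and numbers are purely illustrative):

```python
import pymc3 as pm

nu, tau2 = 5.0, 2.0  # degrees of freedom and scale (placeholders)

with pm.Model():
    # Route 1: the scaling constant nu * tau2 over a chi-squared variable,
    # i.e. nu * tau2 / X with X ~ ChiSquared(nu)
    x = pm.ChiSquared("x", nu=nu)
    pm.Deterministic("scaled_inv_chi2_a", nu * tau2 / x)

    # Route 2: directly as an inverse gamma with alpha = nu/2, beta = nu*tau2/2
    pm.InverseGamma("scaled_inv_chi2_b", alpha=nu / 2, beta=nu * tau2 / 2)
```
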

7ayushgupta commented 3 years ago

Hey! I can try to send a PR adding a mean and sample-size parametrization for the Beta distribution. I'll look at the scaled inverse chi-square after that.

7ayushgupta commented 3 years ago

For the scaled inverse chi-square, should we create a new continuous.ScaledInverseChiSquared or provide some kind of option in the current definition of continuous.InverseGamma?

ricardoV94 commented 3 years ago

Is the alternative beta parametrization described here common?

ckrapu commented 3 years ago

> Is the alternative beta parametrization described here common?

I find it useful for placing priors over parameters that are directly interpretable as probabilities, and I'm currently working on models that use it. That way, setting n=10 and mu=0.9 is easy to interpret as a belief, with 10 pseudocounts' worth of strength, that the value is 0.9.
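
For concreteness, the implied shape parameters are just the usual mean/pseudocount conversion (values repeated from the example above):

```python
n, mu = 10, 0.9
alpha, beta = n * mu, n * (1 - mu)  # Beta(9, 1): 9 prior "successes", 1 prior "failure"
```
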

ghost commented 3 years ago

> Is the alternative beta parametrization described here common?

> I find it useful for placing priors over parameters that are directly interpretable as probabilities, and I'm currently working on models that use it. That way, setting n=10 and mu=0.9 is easy to interpret as a belief, with 10 pseudocounts' worth of strength, that the value is 0.9.

Yep, I love this interpretation of it. Another PyMC3-related advantage is that it avoids the awkward problem where putting a prior on sd is almost impossible, because sd is bounded in a way that depends on mu. Instead you can let n control how dispersed you want the distribution to be and put a prior on that, while mu can be set using, e.g., a logit regression.
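
A minimal sketch of that kind of model; the data, priors, and names below are made up purely for illustration:

```python
import numpy as np
import pymc3 as pm

# Toy data (illustrative only)
X = np.random.normal(size=(100, 2))
y = np.random.uniform(0.01, 0.99, size=100)

with pm.Model():
    coef = pm.Normal("coef", 0.0, 1.0, shape=2)
    mu = pm.math.invlogit(pm.math.dot(X, coef))  # mean in (0, 1) via the logit link
    n = pm.Gamma("n", alpha=2.0, beta=0.1)       # pseudo-sample-size handles dispersion
    pm.Beta("obs", alpha=mu * n, beta=(1 - mu) * n, observed=y)
```
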

If this is extended to the beta-binomial, it also helps with underdispersed data. If you reparametrize the beta-binomial in terms of the number of trials n, γ = (α+β)^-1 (one over the pseudocount we were calling n earlier), and p = μ/n = α/(α+β) (the per-trial success probability), you can model underdispersed data by dropping the constraint that γ must be positive. I would have found this useful today while trying to model an underdispersed count in my data.

ghost commented 3 years ago

Worth mentioning: if we add parametrizations with a dispersion parameter that can go negative to model underdispersion, an allow_negative flag that defaults to False might be a good idea. The underdispersed beta-binomial loses the physical interpretation of a binomial whose success probability is drawn from a beta, and since underdispersion is rare in most data sets and most people reaching for a beta-binomial want to model overdispersion, the default should be to disallow negative values. (It just happens that in some cases you're effectively adding anti-correlated trials, so negative values aren't actually wrong.)
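
As a hypothetical sketch of the guard behind such a flag (allow_negative is a proposal in this thread, not an existing PyMC keyword):

```python
def validate_gamma(gamma, allow_negative=False):
    """Hypothetical helper: reject negative dispersion unless explicitly allowed."""
    if gamma < 0 and not allow_negative:
        raise ValueError(
            "gamma < 0 implies underdispersion and loses the beta-mixture "
            "interpretation of the beta-binomial; pass allow_negative=True to opt in."
        )
    return gamma
```
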

ghost commented 3 years ago

> Is the alternative beta parametrization described here common?

I just spotted the reparametrization of the Beta distribution in terms of γ and p out in the wild on the Discourse, where it was used to improve inference dramatically (~300 divergences out of 1000 down to 0). I think it's a good idea to add it as another parametrization for the beta and beta-binomial (and allow negative values of γ for the beta-binomial by setting a flag).

ghost commented 3 years ago

> For the scaled inverse chi-square, should we create a new continuous.ScaledInverseChiSquared or provide some kind of option in the current definition of continuous.InverseGamma?

Sorry I missed this. ScaledInverseChiSquared is quite a mouthful and hard to type. Since InverseGamma is the more popular name, I would stick with that.
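
If it goes the InverseGamma route, one possible shape for that option is a thin convenience wrapper that converts (ν, τ²) into the existing alpha/beta parameters; the helper name below is made up for illustration:

```python
import pymc3 as pm

def scaled_inv_chi2(name, nu, tau2, **kwargs):
    """Hypothetical wrapper: scaled inverse chi-square(nu, tau2) via InverseGamma."""
    return pm.InverseGamma(name, alpha=nu / 2, beta=nu * tau2 / 2, **kwargs)

with pm.Model():
    sigma2 = scaled_inv_chi2("sigma2", nu=5.0, tau2=2.0)
```
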