This PR fixes a bug in the model specification, which is present in the original publication and the R package as well.
The standard deviation of a t-distribution is not σ.
Instead, variance = nu / (nu - 2) * sigma ** 2 if nu > 2, and variance = ∞ if 1 < nu <= 2.
Therefore, any posterior sample nu <= 2 assigns a non-zero probability that the variance is ∞, which in real life is never the case.
Proof that SD is underestimated by the R package (Python output is essentially the same):
Samples of nu close to 2 are also problematic, as they blow up the SD estimation. Therefore I changed the sampling of nu as such:
originally: ν ~ Exponential(1/29) + 1
now: ν ~ Exponential(1/27.5) + 2.5
( t(df=2.5) is very similar to t(df=2) when it's not extremely far from the mean.)
The other change concerns the sampling of σ.
Originally, σ was uniformly distributed between σ^ / 1000 and σ^ * 1000 (where σ^ is the sample standard deviation), meaning the prior probability of σ > σ^ was 1000 times that of σ < σ^. This caused an overestimation of σ at small sample sizes.
Now, log(σ) is distributed uniformly between log( σ^ / 1000 ) and log( σ^ * 1000 ).
As a consequence, reduced the displayed limits of the std.devs in
plot_posterior and plot_all to 99.5% of the samples.
(This is part 3 of 4 pull requests that build on top of one another.)
This PR fixes a bug in the model specification, which is present in the original publication and the R package as well.
The standard deviation of a t-distribution is not σ. Instead, variance = nu / (nu - 2) * sigma ** 2 if nu > 2, and variance = ∞ if 1 < nu <= 2. Therefore, any posterior sample nu <= 2 assigns a non-zero probability that the variance is ∞, which in real life is never the case.
Proof that SD is underestimated by the R package (Python output is essentially the same):
Samples of nu close to 2 are also problematic, as they blow up the SD estimation. Therefore I changed the sampling of nu as such:
( t(df=2.5) is very similar to t(df=2) when it's not extremely far from the mean.)
The other change concerns the sampling of σ.
As a consequence, reduced the displayed limits of the std.devs in plot_posterior and plot_all to 99.5% of the samples.
(This is part 3 of 4 pull requests that build on top of one another.)