fix issues in Gamma distributions

peterstangl commented 2 years ago

This PR fixes several issues in the implementation of the Gamma distribution and related distributions:

So far the support of the Gamma distribution has extended from 0 (potentially shifted by the loc parameter) to the x value where the CDF is 1-2e-9. For a sizable number of total counts, this range can be very large, which can make convolutions with such a distribution numerically unstable. However, in this case, the probability contained in the region of small x would be completely negligible since the distribution would be very close to a Gaussian distribution with a mode far away from 0. The solution implemented in this PR is to use a support given by the region between the x values at which the CDF is larger than 1e-9 and smaller than 1-1e-9, with the additional requirement that the mode is within the support (e.g. if the number of counts is 0 and the mode is at 0, the lower boundary of the support will also be at 0).
For GeneralGammaDistributionPositive, a warning is raised if the background variance is vanishing or very small. However, this was only done if the number of total counts was not equal to 0. Furthermore, a very small non-vanishing variance was kept finite. The latter point results in the issue that the convolution of a very narrow Gaussian with a Gamma distribution can become numerically unstable and then gives wrong results. The solution of this PR is to set the background variance exactly to 0 if it is below a certain threshold (i.e. if the standard deviation of the Gaussian is at least 100 times smaller than the sqrt of the variance of the gamma distribution), in which case the convolution becomes the trivial convolution of the gamma distribution with a delta distribution. Furthermore, this treatment is extended to the case in which the number of total counts equals 0.
For GeneralGammaDistributionPositive, the convolution of the Gamma distribution with a Gaussian is problematic in particular if the background variance is sizable and the number of total counts is very small or zero. In this case, the mode of the initial gamma distribution is shifted by the convolution from being close or equal to zero to larger values. This can even lead to cases where, after the convolution, values of x close or equal to zero get excluded by a certain number of sigmas even though the initial gamma distribution has its mode exactly at zero. So instead of having an upper bound on x as expected, one can end up with a distribution that strongly disfavors vanishing values of x. The reason for this is that by convoluting with a Gaussian, part of the probability content of the initial gamma distribution close to zero is moved to negative values, thereby reducing the probability of small positive values of x. This reduction remains after the final distribution is restricted to positive values of x. The solution of this PR is to use a folded normal distribution (see e.g. https://en.wikipedia.org/wiki/Folded_normal_distribution) for the convolution instead of a normal distribution. The folded normal distribution is equal to a normal distribution for large positive values of x and becomes a half-normal distribution for x=0. Consequently, if the mode of the initial Gamma distribution is at x=0, it will stay there and the resulting distribution will still be an upper bound. This procedure is implemented by first "mirroring" the gamma distribution at 0 (potentially shifted by the loc parameter), then convoluting the resulting distribution with a conventional normal distribution and then restricting the result to positive values of x.

peterstangl commented 2 years ago

@DavidMStraub if you have time it would be great if you can take a quick look at this PR :)

peterstangl commented 1 year ago

@DavidMStraub did you actually also take a look at the second and third point described above? I think this PR should be fine and I will merge it soon.

peterstangl commented 1 year ago

This PR has again been checked by @alekssmolkovic and myself. So I think it can be merged.

flav-io / flavio

fix issues in Gamma distributions #190