CompEvol / beast2

Bayesian Evolutionary Analysis by Sampling Trees
www.beast2.org
GNU Lesser General Public License v2.1

Gamma quantiles #343

Closed · rbouckaert closed this issue 8 years ago

rbouckaert commented 9 years ago

In the BEAUti priors tab, for a gamma prior with alpha (shape)=0.001 and beta (scale)=1000, BEAUti displays 8.73e-10 for the median. In R:

    > qgamma(.5, shape=.001, scale=1000)
    [1] 5.244206e-299

Why the discrepancy?

tgvaughan commented 9 years ago

The apache math GammaDistribution class allows you to set the precision of the quantile estimates.

public GammaDistribution(RandomGenerator rng,
                 double shape,
                 double scale,
                 double inverseCumAccuracy)

If you don't set this then the default value of 10^-9 is used (http://commons.apache.org/proper/commons-math/apidocs/src-html/org/apache/commons/math3/distribution/GammaDistribution.html#line.37).
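For a quick standalone check of that accuracy parameter, a minimal sketch, assuming the commons-math3 GammaDistribution linked above is on the classpath (BEAST itself may bundle a different commons-math version), and using the tighter accuracy value discussed later in this thread:

    import org.apache.commons.math3.distribution.GammaDistribution;

    public class GammaQuantileAccuracyDemo {
        public static void main(String[] args) {
            // Two-argument constructor: the inverse-CDF accuracy stays at the
            // 1e-9 default, so the root finder stops long before the true median.
            GammaDistribution coarse = new GammaDistribution(0.001, 1000);

            // Three-argument constructor: pass a much tighter inverseCumAccuracy.
            GammaDistribution fine = new GammaDistribution(0.001, 1000, 1e-320);

            System.out.println("accuracy 1e-9  : " + coarse.inverseCumulativeProbability(0.5));
            System.out.println("accuracy 1e-320: " + fine.inverseCumulativeProbability(0.5));
        }
    }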

alexeid commented 9 years ago

So, set it to 10^-99? Or increase the exponent by 1 until convergence? Any other ideas?

tgvaughan commented 9 years ago

I vote against increasing the threshold until convergence, simply because there's already a root-finding algorithm doing this inside the inverseCumulativeProbability method, so we'd wind up with nested root-finding methods! Setting it to 10^-320 gets agreement with R's result for Remco's example above. Maybe simply make the tolerance a bit smaller and put a little asterisk next to BEAUti's quantile numbers stating this tolerance? We're always going to be bitten by rounding errors at some point, especially with distributions as evil as Remco's example...

alexeid commented 9 years ago

I agree. Decrease the tolerance. In BEAUti I don’t think there is a cost to decreasing it to a very small number, since the GUI is only doing this calculation infrequently.

Remco’s example is for a distribution that is very often used and I believe it is the default distribution for some parameters. So it would be nice if it worked for that case.

tgvaughan commented 9 years ago

Hrm, ParametricInputEditor uses the ParametricDistribution instance that is part of the BEAST model. This maintains a single instance of the Apache Commons Math Distribution class. The tolerance can only be set via the Distribution constructor, so the only way to get BEAUti to use a different tolerance is to have each ParametricDistribution maintain two distinct Distribution instances. As the abstract ParametricDistribution doesn't know enough to create these instances, code would need to be added to each of its subclasses. Some of these are probably not in the core...
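
Purely as an illustration of that "two instances" idea (all class and method names below are hypothetical, not the actual ParametricDistribution API; commons-math3 is assumed):

    import org.apache.commons.math3.distribution.RealDistribution;

    // Hypothetical sketch only: keep a fast default-accuracy instance for the
    // MCMC and a slow tight-accuracy instance for displaying quantiles in BEAUti.
    public abstract class TwoAccuracyDistribution {
        protected RealDistribution runtimeDist; // built with the 1e-9 default accuracy
        protected RealDistribution displayDist; // built with a much tighter accuracy

        // Only the concrete subclass knows which constructor (and which
        // inverseCumAccuracy argument) to call, so each subclass must build both.
        protected abstract void rebuildDistributions();

        public double quantileForDisplay(double p) {
            return displayDist.inverseCumulativeProbability(p);
        }
    }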

rbouckaert commented 9 years ago

Perhaps we can report in BEAUti that numbers shown below the density graph are within the tolerance of 1e-9, when these reported numbers get within twice the tolerance (2e-9). E.g. by adding an asterisk to the number and a line below the last number.

alexeid commented 9 years ago

We should decrease the tolerance substantially. The tolerance is too high at the moment. Gamma(0.001, 1000) is quite a standard prior.

tgvaughan commented 9 years ago

If accurately reporting the quantiles for Gamma(0.001, 1000) is required, we need to force a tolerance of at most 10^-300. Is this wise?

alexeid commented 9 years ago

What is the downside? Has anybody investigated the computational costs?

tgvaughan commented 9 years ago

Good question. I just get a dirty feeling when setting tolerances that close to Double.MIN_VALUE :-)
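
For context on how little headroom there is down there, a small sketch (the limits printed below are standard Java constants):

    public class DoubleLimits {
        public static void main(String[] args) {
            // Smallest positive double at all (subnormal): 4.9e-324.
            System.out.println("Double.MIN_VALUE  = " + Double.MIN_VALUE);
            // Smallest positive *normal* double: ~2.2e-308.
            System.out.println("Double.MIN_NORMAL = " + Double.MIN_NORMAL);
            // So a tolerance of 1e-320 is already in the subnormal range, while the
            // median of Gamma(0.001, 1000) (~5.2e-299) is still a normal double.
        }
    }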

rbouckaert commented 9 years ago

This gives accurate results for small values, but does not affect larger values:

  1. set the maximum number of iterations in BrentSolver to 1000
  2. replace in AbstractContinuousDistribution.inverseCumulativeProbability

the call

      double root = UnivariateRealSolverUtils.solve(rootFindingFunction,
                bracket[0], bracket[1], getSolverAbsoluteAccuracy());

with an adaptive solver accuracy, like so:

        // Start from the configured accuracy and keep tightening it by a factor
        // of 10 until the root is clearly larger than the accuracy (or the
        // accuracy bottoms out near the smallest representable doubles).
        double eps = 10.0 * getSolverAbsoluteAccuracy();
        double root = 0;
        while (Math.abs(root) < 1000 * eps && eps > 1e-320) {
            eps /= 10.0;
            root = UnivariateRealSolverUtils.solve(rootFindingFunction, bracket[0], bracket[1], eps);
        }
        // A root still within the final accuracy of zero is indistinguishable from zero.
        if (Math.abs(root) < eps) {
            root = 0;
        }

This gives the correct answer for the originally posed problem.
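
The patch above changes commons-math internals. For anyone who cannot patch the library, the same adaptive-accuracy idea can be sketched from the caller's side (an illustration only, assuming commons-math3, not the actual BEAST change):

    import org.apache.commons.math3.distribution.GammaDistribution;

    public class AdaptiveGammaQuantile {
        // Tighten the inverse-CDF accuracy until the returned quantile is no longer
        // swamped by the accuracy itself, mirroring the loop in the patch above but
        // rebuilding the distribution instead of editing the solver call.
        static double quantile(double shape, double scale, double p) {
            double eps = 1e-9; // commons-math default inverse-CDF accuracy
            double q = new GammaDistribution(shape, scale, eps).inverseCumulativeProbability(p);
            while (Math.abs(q) < 1000 * eps && eps > 1e-320) {
                eps /= 10.0;
                q = new GammaDistribution(shape, scale, eps).inverseCumulativeProbability(p);
            }
            return Math.abs(q) < eps ? 0 : q;
        }

        public static void main(String[] args) {
            // For Remco's example this should land near R's 5.244206e-299.
            System.out.println(quantile(0.001, 1000, 0.5));
        }
    }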

tgvaughan commented 9 years ago

Great news, Remco. We'll still need that asterisk though! :-)

alexeid commented 9 years ago

Is there a speed cost for that solution?

rbouckaert commented 9 years ago

There is a speed cost when the cumulative distribution is close to 0 (< 1e-9).

I am not aware of any call to inverseCumulativeProbability during the MCMC, except for the relaxed clock model (and non-default Gamma site models, though probably only for large values). Other calls come from pre-processing (like BEAUti) and post-processing tools, where speed does not matter much anyway.

A quick test with testRelaxedClock.xml shows no measurable impact on the relaxed clock, though.
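
A crude way to see where that cost actually shows up (illustration only, assuming commons-math3; the testRelaxedClock.xml run remains the realistic check):

    import org.apache.commons.math3.distribution.GammaDistribution;

    public class QuantileTiming {
        static long medianNanos(GammaDistribution d) {
            long t0 = System.nanoTime();
            d.inverseCumulativeProbability(0.5);
            return System.nanoTime() - t0;
        }

        public static void main(String[] args) {
            // Same tight accuracy for both; only the distribution whose median is
            // tiny should pay noticeably for it.
            GammaDistribution benign = new GammaDistribution(2.0, 1.0, 1e-320);
            GammaDistribution evil = new GammaDistribution(0.001, 1000, 1e-320);
            System.out.println("Gamma(2, 1)        median: " + medianNanos(benign) + " ns");
            System.out.println("Gamma(0.001, 1000) median: " + medianNanos(evil) + " ns");
        }
    }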

rbouckaert commented 8 years ago

Fixed by commit e7a6af5.