chjackson / flexsurv

The flexsurv R package for flexible parametric survival and multi-state modelling
http://chjackson.github.io/flexsurv/
54 stars 28 forks source link

Numerical underflow in pgengamma #58

Open chjackson opened 5 years ago

chjackson commented 5 years ago

pgengamma underflows for extreme values of Q:

> pgengamma(c(56.58, 56.59), mu=0.7, sigma=exp(-1.5707), Q=-45.9621)
[1] 0.2968058 1.0000000

This behaviour is inherited from pgamma:

> pgamma(c(4e-324, 4e-325), 0.0004, 1)
[1] 0.742639 0.000000

Perhaps there's a workaround that uses logs in the intermediate calculations done in

https://github.com/chjackson/flexsurv-dev/blob/master/src/gengamma.h

heor-robyoung commented 3 years ago

In this particular case, it's the very low value of the "q" argument to pgamma that's causing the issue as it's getting approximated to 0 and escaping via that route. When q is low (< 1), the "pgamma_smallx" evaluation path is used (referred to Abramowitz and Stegun 6.5.29 [right]; I'm afraid I haven't tracked down the equivalent in the DLMF yet).

In this evaluation, for extremely small values the initial term is so small it lies within the episilon bound and the series is exited immediately. For alpha = 1 / (Q * Q) < 1, the result then reduces to:

exp( ((log(q) - mu) / (sigma) - log(Q * Q)) * 1/Q - log(gamma( 1 + 1 / (Q * Q)) )

It should be noted that the evaluation of pgengamma starts leaking precision before we hit the values you've given in your example, and we should switch over to the approximation at an earlier critical value.

Below is an extremely naive implementation, but shows the sort of behaviour we should expect when allowing this escape hatch:

`library(flexsurv) pgengamma_flexsurv <- pgengamma pgengamma_catch <- function(q, mu, sigma, Q){ Qwqq_critical_low <- log(.Machine$double.xmin)

y <- log(q) w <- (y - mu) / sigma qq <- 1 / (Q * Q)

Qwqq <- Q * w + log(qq)

o <- numeric(length(q))

original <- Qwqq >= Qwqq_critical_low if (any(original)){ o[original] <- pgengamma_flexsurv(q, mu, sigma, Q)[original] } if (any(!original)){ o[!original & Q > 0] <- exp(Qwqq qq - lgamma(1 + qq))[!original & Q > 0] o[!original & Q < 0] <- 1 - exp(Qwqq qq - lgamma(1 + qq))[!original & Q < 0] } o } pgengamma_flexsurv(c(56.58, 56.59), mu=0.7, sigma=exp(-1.5707), Q=-45.9621, log = FALSE) pgengamma_catch(c(56.58, 56.59), mu=0.7, sigma=exp(-1.5707), Q=-45.9621)

times <- seq(0,500,1) plot(times, pgengamma_flexsurv(times, mu=0.7, sigma=exp(-1.5707), Q=-45.9621), type = "l") lines(times, pgengamma_catch(times, mu=0.7, sigma=exp(-1.5707), Q=-45.9621), col = "red")`