sambrilleman / simsurv

Simulate Survival Data
GNU General Public License v3.0
22 stars 8 forks source link

"interval" argument support for a lower limit of zero #4

Closed ellessenne closed 6 years ago

ellessenne commented 6 years ago

Hi Sam! I modified the checks on the interval argument of simsum to accept a lower limit of zero: I did so as (sometimes) I find it easier to just set interval = c(0, 500) than to fiddle around until I find a value that uniroot() likes. Happy to discuss this further though! Cheers,

Alessandro

sambrilleman commented 6 years ago

Cheers Alessandro! I'm curious at to why I didn't used 0 in the original version. My guess is that there must have been some odd behaviour on the boundary (perhaps if the hazard = -Inf or +Inf at time 0?). After looking through the code I can't find why a lower limit of zero wouldn't work. Did you find interval = c(0, 500) to behave better than interval = c(1E-8, 500)?

I'll run the simulation tests this arvo and check it doesn't crash. And if so, will merge in...

ellessenne commented 6 years ago

Thanks Sam! Yeah, I had the same problem when coding simulating from a Weibull-Weibull hazard myself and I solved it by allowing uniroot() to search between 0 and a large number! I think it is caused as you mention by hazard functions that tend to infinity at time t = 0! For instance, I encountered the problem when simulating a model with a random intercept and a mixture Weibull baseline hazard:

# Set seed
set.seed(22342)

# Simulate data with random effect
n_individuals = 250
n_clusters = 15
fv = 1.25
X <- data.frame(id = 1:(n_individuals * n_clusters), grpid = rep(1:n_clusters, 
  each = n_individuals), trt = stats::rbinom(n_individuals * n_clusters, 1, 
  0.5))
X[["frvec"]] <- rep(stats::rnorm(n_clusters, mean = 0, sd = sqrt(fv)), each = n_individuals)

# Not ok:
S <- simsurv::simsurv(dist = "weibull", mixture = TRUE, lambdas = c(0.5, 1.5), 
  gammas = c(0.03, 0.9), pmix = 0.7, x = X, betas = c(trt = -0.5, frvec = 1), 
  maxt = 5)
#> Error in stats::uniroot(rootfn_surv, survival = survival, x = x_i, betas = betas_i, : f() values at end points not of opposite sign

This mixture Weibull hazard is large early on (small t), decreasing thereafter.

For the moment, I solved it by setting the minimum of the interval to the smallest number larger than zero that the machine can use:

# Ok:
S <- simsurv::simsurv(dist = "weibull", mixture = TRUE, lambdas = c(0.5, 1.5), 
  gammas = c(0.03, 0.9), pmix = 0.7, x = X, betas = c(trt = -0.5, frvec = 1), 
  maxt = 5, interval = c(.Machine$double.xmin, 500))

# Smallest number my machine can use:
.Machine$double.xmin
#> [1] 2.225074e-308

After that, I manually "trim" values smaller than a day (1 / 365.242) in order to reduce numerical problems caused by survival times that are too small. I hope this helps!