chjackson / flexsurv

The flexsurv R package for flexible parametric survival and multi-state modelling
http://chjackson.github.io/flexsurv/
53 stars 28 forks source link

`summary.flexsurvreg(type = "hazard")` breaks for `NA` in `newdata` #120

Closed hfrick closed 2 years ago

hfrick commented 2 years ago

I'm working with the current devel version of flexsurv and have encountered an error with summary.flexsurvreg(type = "hazard") when the newdata contains NA. The error occurs in check.weibull() which does not expect NA in either the shape or scale parameters (and thus the if() statement fails).

https://github.com/chjackson/flexsurv-dev/blob/1a7795ec0b670cff9b8b158d9a47084403e2a403/R/Weibull.R#L31-L36

library(flexsurv)
#> Loading required package: survival

lung_train <- lung[-c(1:5), ]
lung_test <- lung[13:15, ]

# contains NA
lung_test$ph.ecog
#> [1]  1 NA  1

mod <- flexsurvreg(Surv(time, status) ~ ph.ecog, data = lung_train,
                   dist = "weibull")

# summary() errors for hazard
summary(mod, newdata = lung_test, type = "hazard",
        t = c(100, 500, 1000), ci = FALSE)
#> Error in if (any(scale < 0)) {: missing value where TRUE/FALSE needed

Created on 2022-02-15 by the reprex package (v2.0.1)

chjackson commented 2 years ago

Thanks - I think this is now fixed in https://github.com/chjackson/flexsurv-dev/commit/bf7a19bd2a91e426fca707858768786492f2d3e5

hfrick commented 2 years ago

Thanks so much! 🙌

mattwarkentin commented 2 years ago

@chjackson I think there is still an issue with missing data for predictions of type = "rmst":

library(flexsurv)
#> Loading required package: survival

fitw <- flexsurvreg(Surv(futime, fustat) ~ age, data = ovarian, dist = "weibull")

ovarian_miss <- ovarian
ovarian_miss$age[[5]] <- NA

summary(fitw, ovarian_miss, type = 'rmst', t = 500)
#> Error in integrate(fn, start[i], t[i]): non-finite function value
mattwarkentin commented 2 years ago

More generally, it seems like the same issue arises for other distributions that have check.{dist} functions (e.g. check.exp()).

The post above reveals an issue with missing data being passed on to integrate() here: https://github.com/chjackson/flexsurv-dev/blob/bf7a19bd2a91e426fca707858768786492f2d3e5/R/utils.R#L158

chjackson commented 2 years ago

Have a go with https://github.com/chjackson/flexsurv-dev/commit/01a79b31e42d0b5fe24a8c81d5ff314ffdffbbe9 - I think I've solved the mentioned problems. Let me know if any others - haven't thoroughly swept all uses of summary functions for NA handling.