chjackson / flexsurv

The flexsurv R package for flexible parametric survival and multi-state modelling
http://chjackson.github.io/flexsurv/

missing values in covariates #56

Closed topepo closed 3 years ago

topepo commented 5 years ago

This line in form.model.matrix automatically drops any rows with missing data.

It would be nice to be able to get the same number of rows out of the summary function's predictions as were in newdata (otherwise you have to track down the missing rows and pad the predicted values).

One idea is to just use na.pass and let the predicted values have NA. Otherwise, having na.action as an option that gets passed along to model.frame would allow the user to make their own choice (but I would vote for na.pass).

> library(sessioninfo)
> library(flexsurv)
Loading required package: survival
> ovarian_miss <- ovarian
> ovarian_miss$age[1] <- NA
> 
> fitw <-
+   flexsurvreg(formula = Surv(futime, fustat) ~ age,
+               data = ovarian_miss[-(1:5), ],
+               dist = "weibull")
> predictions <- 
+   summary(fitw, newdata = ovarian_miss[1:5, "age", drop = FALSE], t = 1000)
> 
> length(predictions)
[1] 4
> session_info()
─ Session info ──────────────────────────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 3.5.0 (2018-04-23)
 os       macOS High Sierra 10.13.6   
 system   x86_64, darwin15.6.0        
 ui       RStudio                     
 language (EN)                        
 collate  en_US.UTF-8                 
 tz       America/New_York            
 date     2018-09-19                  

─ Packages ──────────────────────────────────────────────────────────────────────────────────────
 package      * version date       source         
 clisymbols     1.2.0   2017-05-21 CRAN (R 3.5.0) 
 deSolve        1.21    2018-05-09 cran (@1.21)   
 flexsurv     * 1.1     2017-03-27 CRAN (R 3.5.0) 
 lattice        0.20-35 2017-03-25 CRAN (R 3.5.0) 
 Matrix         1.2-14  2018-04-09 CRAN (R 3.5.0) 
 mstate         0.2.11  2018-04-09 CRAN (R 3.5.0) 
 muhaz          1.2.6   2014-08-09 CRAN (R 3.5.0) 
 mvtnorm        1.0-8   2018-05-31 cran (@1.0-8)  
 quadprog       1.5-5   2013-04-17 CRAN (R 3.5.0) 
 RColorBrewer   1.1-2   2014-12-07 CRAN (R 3.5.0) 
 Rcpp           0.12.18 2018-07-23 cran (@0.12.18)
 sessioninfo  * 1.0.0   2017-06-21 CRAN (R 3.5.0) 
 survival     * 2.42-3  2018-04-16 CRAN (R 3.5.0) 
 withr          2.1.2   2018-03-15 CRAN (R 3.5.0) 
 yaml           2.2.0   2018-07-25 cran (@2.2.0)  
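
To make the two options concrete, here is a minimal base R sketch of how the na.action passed to model.frame changes the number of rows that come back. It does not use flexsurv at all; the data frame `d` and the names `mf_omit` and `mf_pass` are made up for illustration, and it only assumes that the prediction code builds its design matrix through model.frame in the usual way.

```r
# Hypothetical newdata with one missing covariate value (not the ovarian data)
d <- data.frame(age = c(NA, 50, 60, 70, 80))

# na.omit drops the incomplete row, so the result is shorter than the input
mf_omit <- model.frame(~ age, data = d, na.action = na.omit)

# na.pass keeps every row and leaves the NA in place
mf_pass <- model.frame(~ age, data = d, na.action = na.pass)

nrow(mf_omit)  # 4
nrow(mf_pass)  # 5
```

With na.pass the row order and count match the input, so any prediction computed row by row would presumably come out as NA for the incomplete row rather than disappearing.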

chjackson commented 4 years ago

Thanks - just implemented the na.action argument to summary.flexsurvreg in https://github.com/chjackson/flexsurv-dev/commit/1193922396d0f53c0dd42ca716656f8404acdf22.

topepo commented 4 years ago

Any chance that you would change the default to na.pass? That would pad the output with missing values. Otherwise, the input data has n rows and the output has < n rows, with no easy way to figure out how to merge them.
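
For reference, the manual padding that is needed when incomplete rows are dropped might look like the sketch below. It is base R only; `newdata`, `preds_kept`, and `padded` are hypothetical names, and the prediction values are invented placeholders rather than output from any fitted model.

```r
# Hypothetical newdata in which the first row has a missing covariate
newdata <- data.frame(age = c(NA, 50, 60))

# Rows that survive an na.omit-style filter
keep <- complete.cases(newdata)

# Placeholder predictions for the complete rows only (made-up numbers)
preds_kept <- c(0.4, 0.3)

# Pad back to the original length, leaving NA where the input was incomplete
padded <- rep(NA_real_, nrow(newdata))
padded[keep] <- preds_kept
```

This only works if the caller can reconstruct exactly which rows were dropped, which is the bookkeeping that a na.pass default would make unnecessary.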

chjackson commented 4 years ago

I thought I did this?

topepo commented 4 years ago

The docs have

##' @param na.action a missing-data filter function, applied after any 'subset'
##' argument has been used. Default is \code{options()$na.action}.

chjackson commented 4 years ago

That's the doc for flexsurvreg, not summary.flexsurvreg.