chjackson / flexsurv

The flexsurv R package for flexible parametric survival and multi-state modelling
http://chjackson.github.io/flexsurv/

added S3 methods for predict, residuals, and broom funs #77

Closed mattwarkentin closed 4 years ago

mattwarkentin commented 4 years ago

Some additional thoughts...

Dependencies: I added the following dependencies: assertthat, generics, purrr, tibble, and tidyr. tibble is necessary for the list-columns, and generics is necessary as a lightweight import of the broom generics (i.e. tidy(), glance(), and augment()).

assertthat can essentially be replaced by stopifnot(), so that dependency can be removed, but it has a very low dependency burden by design. I prefer assertthat for its better and more flexible error messaging, but I am not married to it.
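
For example (a made-up check, just to compare the two styles; not code from the PR):

# assertthat, with a custom error message
check_conf_level <- function(conf.level) {
  assertthat::assert_that(
    is.numeric(conf.level), conf.level > 0, conf.level < 1,
    msg = "`conf.level` must be a number strictly between 0 and 1"
  )
}

# the stopifnot() equivalent; named conditions were added in R 3.5
check_conf_level <- function(conf.level) {
  stopifnot(
    "`conf.level` must be a number strictly between 0 and 1" =
      is.numeric(conf.level) && conf.level > 0 && conf.level < 1
  )
}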

tidyr and purrr are only used once each, I think, so we could probably find workarounds to remove those dependencies if desired. They both depend on dplyr et al., so they do add a somewhat heavy dependency tree. I fought against using purrr, but I ended up using it for something I couldn't figure out how to do with base functions (see here).

R CMD check:

── R CMD check results ─────────────────────────────────────── flexsurv 1.2 ────
Duration: 2m 15.5s

> checking compilation flags used ... WARNING
  Compilation used the following non-portable flag(s):
    ‘-Wconversion’ ‘-Wno-sign-conversion’
  including flag(s) suppressing warnings

0 errors ✓ | 1 warning x | 0 notes ✓
Error: R CMD check found WARNINGs
Execution halted

Exited with status 1.

Eventual TODOs

I look forward to hearing your thoughts. It has been fun to try and contribute to a mature package. Thanks for all of your guidance so far.

mattwarkentin commented 4 years ago

Related to #75

mattwarkentin commented 4 years ago

@topepo @simonpcouch, you may want to review the broom-funs.R file for adherence to existing and future broom guidelines.

simonpcouch commented 4 years ago

Looks great! Generally, for new tidy() methods, we try to maintain the argument order x, conf.int, conf.level, {other arguments}, ....

Also, I mistakenly wrote in #75 that conf.int defaults to TRUE... I meant to say that, in almost all broom::tidy.*() methods except those with backward-compatibility issues, it defaults to FALSE. I've corrected my mistake there. 🙂
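
In practice that ordering looks roughly like this (a sketch using flexsurvreg's $res matrix, not the PR's actual implementation; conf.level is ignored here because $res stores 95% limits):

tidy.flexsurvreg <- function(x, conf.int = FALSE, conf.level = 0.95, ...) {
  res <- tibble::as_tibble(x$res, rownames = "term")
  names(res)[names(res) == "est"] <- "estimate"
  names(res)[names(res) == "se"]  <- "std.error"
  if (conf.int) {
    names(res)[names(res) == "L95%"] <- "conf.low"
    names(res)[names(res) == "U95%"] <- "conf.high"
  } else {
    res <- res[setdiff(names(res), c("L95%", "U95%"))]
  }
  res
}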

mattwarkentin commented 4 years ago

Haha, I thought I was losing my mind! I went back to double-check your previous comment. I was sure I read that it should default to TRUE. No worries, I will flip it back to FALSE as a default.

mattwarkentin commented 4 years ago

Fixed minor tidy issues in 495f6a2.

chjackson commented 4 years ago

Thanks for the PR, Matt! Looks great on a flick through. Might not get time for a detailed look for a few days.

Could someone explain the compatibility issues with defaulting to conf.int = TRUE? I'd like to encourage the use of interval estimates in preference to standard errors for indicating uncertainty around estimates. Particularly for parameters that are not on a real-line scale, the ±2 × standard error heuristic that people use can go wrong.
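
For example (a quick sketch; this assumes the usual est/L95%/U95%/se layout of a flexsurvreg object's $res matrix):

library(flexsurv)  # loads survival, which provides the ovarian data

fit <- flexsurvreg(Surv(futime, fustat) ~ 1, data = ovarian, dist = "exp")

# flexsurv computes the interval for the rate on the log scale and
# back-transforms it, so it always stays positive
fit$res["rate", c("est", "L95%", "U95%")]

# a naive estimate +/- 2*SE on the natural scale has no such guarantee, and
# for an imprecisely estimated positive parameter the lower limit can go negative
est <- fit$res["rate", "est"]
se  <- fit$res["rate", "se"]
c(lower = est - 2 * se, upper = est + 2 * se)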

chjackson commented 4 years ago

Looks really nice! Few things I spotted:

The rename_tidy function didn't work prior to R 4.0.0 because the names_map data frame wasn't defined with stringsAsFactors = FALSE, and that broke a couple of things.

In predict.flexsurvreg, why are the standard errors and confidence limits coupled to the estimates in a list column, rather than simply being added as separate data frame columns? That seems unnecessarily complicated, and I don't think it's mandated by the tidymodels guidelines?

Can there be a bit more help for users for dealing with list columns cleanly? e.g. the output of tidyr::unnest(predict(model, type="quantile", p=c(0.25, 0.75)), cols=".pred") doesn't show which row of the original data each row of the unnested output corresponds to. People may also want a wide data frame rather than the long one that unnest produces.

I think augment should default to using model.frame(x). Unlike for predict, the broom docs seem to encourage this: "augment() should try to reconstruct the original data as much as possible from the model object..."

It's fine to add those dependencies on tidyverse packages. flexsurv 1.1.1 already depended on tidyr and tibble.

mattwarkentin commented 4 years ago

Thanks for the feedback, Chris.

The rename_tidy function didn't work prior to R 4.0.0 because the names_map data frame wasn't defined with stringsAsFactors = FALSE, and that broke a couple of things.

I'll fix that function to be more stable. Perhaps I will just switch from data.frame to tibble, since we already have tibble imported and tibbles never convert strings to factors.
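
Something along these lines (the from/to names are just illustrative; the real mapping lives in rename_tidy):

names_map <- tibble::tibble(
  from = c("est", "se", "L95%", "U95%"),
  to   = c("estimate", "std.error", "conf.low", "conf.high")
)

# or, keeping a plain data frame but being explicit, so behaviour is identical
# before and after R 4.0.0:
names_map <- data.frame(
  from = c("est", "se", "L95%", "U95%"),
  to   = c("estimate", "std.error", "conf.low", "conf.high"),
  stringsAsFactors = FALSE
)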

In predict.flexsurvreg, why are the standard errors and confidence limits coupled to the estimates in a list column, rather than simply being added as separate data frame columns? That seems unnecessarily complicated, and I don't think it's mandated by the tidymodels guidelines?

Hmm, good question. I guess I just made a decision and went with it. But you're right, perhaps standard errors and confidence intervals should be separate columns, and we only use list columns for multiple-row predictions. I guess my thoughts were that we want the expected output of predict.flexsurvreg to be intuitive and predictable for the user. If we go this route, there are three possible outputs depending on input args: a numeric vector, a multi-column table, or a list-column of tables. Is that too much cognitive load for the user to be able to predict what the function is likely to return?

Can there be a bit more help for users for dealing with list columns cleanly? e.g. the output of tidyr::unnest(predict(model, type="quantile", p=c(0.25, 0.75)), cols=".pred") doesn't show which row of the original data each row of the unnested output corresponds to. People may also want a wide data frame rather than the long one that unnest produces.

Yes, definitely. I was planning on writing a vignette once these functions were approved by you. I can certainly beef up the examples section in the help docs as well. I think best practice would be to use predict() to add columns to data frames. Example:

library(tidyverse)
library(flexsurv)
#> Loading required package: survival

fit <- flexsurvspline(Surv(futime, fustat) ~ age, data = ovarian, k = 1)

# single prediction as new column
ovarian %>% 
  as_tibble() %>% 
  mutate(mean_surv = predict(fit))
#> # A tibble: 26 x 7
#>    futime fustat   age resid.ds    rx ecog.ps mean_surv
#>     <dbl>  <dbl> <dbl>    <dbl> <dbl>   <dbl>     <dbl>
#>  1     59      1  72.3        2     1       1      208.
#>  2    115      1  74.5        2     1       1      174.
#>  3    156      1  66.5        2     1       2      359.
#>  4    421      0  53.4        2     2       1     1425.
#>  5    431      1  50.3        2     1       1     1979.
#>  6    448      0  56.4        1     1       2     1023.
#>  7    464      1  56.9        2     2       2      969.
#>  8    475      1  59.9        2     2       2      709.
#>  9    477      0  64.2        2     1       1      452.
#> 10    563      1  55.2        1     2       2     1171.
#> # … with 16 more rows

# multi-time prediction
ovarian %>% 
  as_tibble() %>% 
  mutate(predict(fit, type = "survival", times = c(600, 800)))
#> # A tibble: 26 x 7
#>    futime fustat   age resid.ds    rx ecog.ps .pred           
#>     <dbl>  <dbl> <dbl>    <dbl> <dbl>   <dbl> <list>          
#>  1     59      1  72.3        2     1       1 <df[,2] [2 × 2]>
#>  2    115      1  74.5        2     1       1 <df[,2] [2 × 2]>
#>  3    156      1  66.5        2     1       2 <df[,2] [2 × 2]>
#>  4    421      0  53.4        2     2       1 <df[,2] [2 × 2]>
#>  5    431      1  50.3        2     1       1 <df[,2] [2 × 2]>
#>  6    448      0  56.4        1     1       2 <df[,2] [2 × 2]>
#>  7    464      1  56.9        2     2       2 <df[,2] [2 × 2]>
#>  8    475      1  59.9        2     2       2 <df[,2] [2 × 2]>
#>  9    477      0  64.2        2     1       1 <df[,2] [2 × 2]>
#> 10    563      1  55.2        1     2       2 <df[,2] [2 × 2]>
#> # … with 16 more rows

# same as above, unnested
ovarian %>% 
  as_tibble() %>% 
  mutate(predict(fit, type = "survival", times = c(600, 800))) %>% 
  unnest(.pred)
#> # A tibble: 52 x 8
#>    futime fustat   age resid.ds    rx ecog.ps .time     .pred
#>     <dbl>  <dbl> <dbl>    <dbl> <dbl>   <dbl> <dbl>     <dbl>
#>  1     59      1  72.3        2     1       1   600 0.00805  
#>  2     59      1  72.3        2     1       1   800 0.000709 
#>  3    115      1  74.5        2     1       1   600 0.00119  
#>  4    115      1  74.5        2     1       1   800 0.0000399
#>  5    156      1  66.5        2     1       2   600 0.143    
#>  6    156      1  66.5        2     1       2   800 0.0534   
#>  7    421      0  53.4        2     2       1   600 0.773    
#>  8    421      0  53.4        2     2       1   800 0.679    
#>  9    431      1  50.3        2     1       1   600 0.851    
#> 10    431      1  50.3        2     1       1   800 0.785    
#> # … with 42 more rows
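
And if someone wants the wide layout you mentioned, the unnested output spreads by time easily (this just reuses the .time/.pred columns shown above):

# same as above, pivoted wide: one survival column per prediction time
ovarian %>% 
  as_tibble() %>% 
  mutate(predict(fit, type = "survival", times = c(600, 800))) %>% 
  unnest(.pred) %>% 
  pivot_wider(names_from = .time, values_from = .pred, names_prefix = "surv_")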

I think augment should default to using model.frame(x). Unlike for predict, the broom docs seem to encourage this: "augment() should try to reconstruct the original data as much as possible from the model object..."

This can be done for sure. I went back and forth on this. I noted the augment() documentation for coxph class objects indicates it defaults to model.frame(x), but I found that didn't actually work (maybe @simonpcouch can confirm):

cox_fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian)
augment(cox_fit)
Error: Must specify either `data` or `newdata` argument.
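
Passing the data explicitly does work, which I assume is the intended usage:

augment(cox_fit, data = ovarian)
# adds .fitted, .se.fit, and .resid alongside the original ovarian columns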

But it's certainly an easy fix to make it default to model.frame(x).

It's fine to add those dependencies on tidyverse packages. flexsurv 1.1.1 already depended on tidyr and tibble.

Okay, great!

chjackson commented 4 years ago

Thanks, yes. The tidymodels doc about predict methods implies that confidence limits and standard errors should be in separate columns, and only mentions list columns for non-scalar predictions (e.g. as you do for quantiles and functions of time). It also suggests the return should always be a tibble, even where there is just one output and a base R predict method might return a vector. I don't really mind what is done in that case, but this guideline seems reasonable.

mattwarkentin commented 4 years ago

How do you feel about the fact that model.frame(x) doesn't really return the input data in the same form? This is mostly pertinent to augment.flexsurvreg(), since it is supposed to return the "original" data along with the added predictions/residuals.

See below:

library(flexsurv)
#> Loading required package: survival
library(broom)

fs_fit <- flexsurvreg(Surv(futime, fustat) ~ age, data = ovarian, dist = "exp")
model.frame(fs_fit)
#>    Surv(futime, fustat)     age (weights)
#> 1                    59 72.3315         1
#> 2                   115 74.4932         1
#> 3                   156 66.4658         1
#> 4                  421+ 53.3644         1
#> 5                   431 50.3397         1
#> 6                  448+ 56.4301         1
#> 7                   464 56.9370         1
#> 8                   475 59.8548         1
#> 9                  477+ 64.1753         1
#> 10                  563 55.1781         1

as_tibble(model.frame(fs_fit))
# A tibble: 26 x 3
   `Surv(futime, fustat)`[,"time"] [,"status"]   age `(weights)`
                             <dbl>       <dbl> <dbl>       <dbl>
 1                              59           1  72.3           1
 2                             115           1  74.5           1
 3                             156           1  66.5           1
 4                             421           0  53.4           1
 5                             431           1  50.3           1
 6                             448           0  56.4           1
 7                             464           1  56.9           1
 8                             475           1  59.9           1
 9                             477           0  64.2           1
10                             563           1  55.2           1
# … with 16 more rows

augment(fs_fit)
# A tibble: 26 x 6
   `Surv(futime, f… [,"status"]   age `(weights)` .fitted
              <dbl>       <dbl> <dbl>       <dbl>   <dbl>
 1               59           1  72.3           1    213.
 2              115           1  74.5           1    165.
 3              156           1  66.5           1    427.
 4              421           0  53.4           1   2016.
 5              431           1  50.3           1   2886.
 6              448           0  56.4           1   1402.
 7              464           1  56.9           1   1320.
 8              475           1  59.9           1    934.
 9              477           0  64.2           1    560.
10              563           1  55.2           1   1626.
# … with 16 more rows, and 2 more variables: .se.fit <dbl>,
#   .resid <dbl>

Should we fix the names of the first two columns by parsing the Surv() call, and drop the (weights) column?

mattwarkentin commented 4 years ago

Maybe this is why augment.coxph actually requires data or newdata to be specified...

mattwarkentin commented 4 years ago

In predict.flexsurvreg, why are the standard errors and confidence limits coupled to the estimates in a list column, rather than simply being added as separate data frame columns? That seems unnecessarily complicated, and I don't think it's mandated by the tidymodels guidelines?

Fixed in d4c1d81

chjackson commented 4 years ago

For augment, ideally model.frame.flexsurvreg would return the variables in the Surv object as they were in the original data, but they'd be fiddly to extract and reproduce exactly, as they look a bit different for different kinds of censoring. I don't think it's a problem to just have the Surv object, as model.frame.survreg and model.frame.coxph do, because that allows you to do the same kind of survival modelling you did with the original data. We can at least document that people should supply the original data if they need it. Also, I don't think there's any harm in keeping extra columns like "(weights)"; they might be helpful for some.

mattwarkentin commented 4 years ago

but they'd be fiddly to extract and reproduce exactly, as they look a bit different for different kinds of censoring

My thoughts exactly. Too many possible usages of Surv() to parse it safely.

Okay, I will move forward with this implementation.

mattwarkentin commented 4 years ago

70e8808 supports model.frame(x) for augment.flexsurvreg. I added a note to the help docs to explain that users should specify data if they want the exact original data returned.
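
Usage is roughly as follows (reusing fs_fit from above; see the help docs for the exact arguments):

# default: rebuilds the data via model.frame(x), so the Surv object is kept
# as a single column
augment(fs_fit)

# supply the original data to get back the exact original columns alongside
# .fitted, .se.fit, and .resid
augment(fs_fit, data = ovarian)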

I also added some bullet points to the inst/NEWS file for v1.2, which you may wish to review.

chjackson commented 4 years ago

Great, thanks. All merged in now.