chjackson / flexsurv

The flexsurv R package for flexible parametric survival and multi-state modelling
http://chjackson.github.io/flexsurv/
53 stars 28 forks source link

Prediction after manually editing coefficient values #129

Closed vincentarelbundock closed 2 years ago

vincentarelbundock commented 2 years ago

Hi,

Thanks for your work on this package.

I am the maintainer of marginaleffects, a package which can be used to compute adjusted predictions, contrasts, and slopes for 62+ model classes in R. I would like to add support for flexsurv objects. In order to compute standard errors around those quantities I need to do something weird and specific:

  1. Copy the model object
  2. Modify the value of the coefficients held internally in the new model object
  3. Make predictions by calling predict() on the new object. Those predictions should be different than those made using the original object.

I tried doing this:

require("flexsurv")

mod1 <- flexsurvreg(Surv(recyrs, censrec) ~ group, data = bc, dist = "weibull")

mod2 <- mod1

mod2$coefficients <- setNames(rep(0, length(mod$coefficients)),
                              names(mod$coefficients))

Coefficients have been altered:

coef(mod1)
#>       shape       scale groupMedium   groupPoor 
#>   0.3218311   2.4356168  -0.6135892  -1.2122137

coef(mod2)
#>       shape       scale groupMedium   groupPoor 
#>           0           0           0           0

But predictions remain the same:

predict(mod1, newdata = head(bc, 2))
#> # A tibble: 2 × 1
#>   .pred
#>   <dbl>
#> 1  10.4
#> 2  10.4

predict(mod2, newdata = head(bc, 2))
#> # A tibble: 2 × 1
#>   .pred
#>   <dbl>
#> 1  10.4
#> 2  10.4

Do you have any ideas for how to proceed? Is this possible at all?

Thanks!

chjackson commented 2 years ago

predict.flexsurvreg wraps summary.flexsurvreg, which in turn reads the estimated parameters from the $res.t component of the fitted model object. (actually, covariate effects are taken from the $res component, which contains the parameters on the natural scale rather than log-transformed scale, but these are identical to those in $res.t for covariate effects, which are not log-transformed during estimation). The $coefficients component is just provided as a shortcut for users who expect parameter estimates to be found there. So overwriting them won't change any other summaries of the model.

I don't know if it safe to assume that your procedure will do what you expect for every modelling package. There are so many de facto, undocumented standards and conventions in R modelling that not everyone will obey! But it is nice to have packages like yours, tidymodels and the like, that try to standardise interfaces.

Your procedure for getting SEs sounds similar to what normboot.flexsurvreg function does. Also there is a function standsurv.flexsurvreg in the devel version for standardised survival and hazards - I don't know if that is of any use to you.

vincentarelbundock commented 2 years ago

This is super helpful, thanks!

I don't know if it safe to assume that your procedure will do what you expect for every modelling package. There are so many de facto, undocumented standards and conventions in R modelling that not everyone will obey!

As you can imagine, this has a been quite a challenge. I've had to write many custom methods to handle corner cases.

I try not to "assume" that the procedure works, and instead test my numerical results against alternative software: Stata, as well as the emmeans and margins packages for R. Unfortunately, this is not not always possible, but a good proportion of models are covered by the test suite: https://vincentarelbundock.github.io/marginaleffects/articles/mfx06_supported_models.html#marginal-effects-and-contrasts