chjackson / flexsurv

The flexsurv R package for flexible parametric survival and multi-state modelling
http://chjackson.github.io/flexsurv/
55 stars 27 forks source link

Ordering of time output in standsurv #177

Closed irtimmins closed 8 months ago

irtimmins commented 8 months ago

Hi, standsurv appears to reorder output to ascending order of time, see example below. It may be preferable to have the order consistent with the t argument input?

# Using bc dataset
test_mod <- flexsurvspline(Surv(recyrs, censrec) ~ 1, data=bc, k=5)

# these produce same output
standsurv(test_mod, t=c(1,2,3,4), type = "haz")
standsurv(test_mod, t=c(3,1,4,2), type = "haz")

# as do these
standsurv(test_mod, t=c(1,2,3,4), type = "surv")
standsurv(test_mod, t=c(3,1,4,2), type = "surv")

###### and these
standsurv(test_mod, t=c(1,2,3,4), type = "rmst")
standsurv(test_mod, t=c(3,1,4,2), type = "rmst")
chjackson commented 8 months ago

I would tend to agree, at least for consistency with what summary.flexsurvreg does. @mikesweeting what do you think - was this deliberate?

mikesweeting commented 8 months ago

This is not intentional, but appears to be a feature of group_by() and summarise() in dplyr, see https://github.com/tidyverse/dplyr/issues/2159. Would need a bit of a hack to fix this. However, I can see the argument for consistency with summary.flexsurvreg. @chjackson what do you suggest?

irtimmins commented 8 months ago

Another problem here is that for the bootstrap method (but not the delta method) the confidence intervals/standard errors are kept in the same order, but the at1 is in ascending order.

# Using bc dataset
test_mod <- flexsurvspline(Surv(recyrs, censrec) ~ 1, data=bc, k=5)

standsurv(test_mod, t=c(1,5), type = "rmst", se = TRUE, ci = TRUE, boot = T, B = 1000)
Calculating bootstrap standard errors / confidence intervals
# A tibble: 2 × 5
   time   at1  at1_se at1_lci at1_uci
  <dbl> <dbl>   <dbl>   <dbl>   <dbl>
1     1 0.975 0.00403   0.967   0.982
2     5 3.61  0.0496    3.52    3.71

standsurv(test_mod, t=c(5,1), type = "rmst", se = TRUE, ci = TRUE, boot = T, B = 1000)
Calculating bootstrap standard errors / confidence intervals
# A tibble: 2 × 5
   time   at1  at1_se at1_lci at1_uci
  <dbl> <dbl>   <dbl>   <dbl>   <dbl>
1     1 0.975 0.0516    3.51    3.71 
2     5 3.61  0.00424   0.965   0.982
mikesweeting commented 8 months ago

Thank you @irtimmins for reporting. This is a more serious issue and I'll look into it further

mikesweeting commented 8 months ago

I propose adding a check to ensure that argument t is sorted in ascending order, and that it is unique (neither checks are currently done). It would return an error if this is not the case. This would prevent the issues above arising. It does however depart from how summary.flexsurvreg deals with the t argument so would be good to get your view @chjackson?

t being unique is important, as it will throw an error for SE / CI calculations hence why I'm suggesting adding this check. So this will deviate somewhat from summary.flexsurvreg anyway. The alternative is more extensive recoding.

chjackson commented 8 months ago

Ooh I'm not keen on forcing people to sort their input. It violates the software robustness principle of being liberal in what you accept. Similarly if the input times are non-unique, I don't see why the outputs can't be replicated accordingly?

The bug where the CIs don't match the estimates should be fixed at least (I'm hoping this part isn't much work!). But I'd rather keep the original behaviour round sorting (which wasn't incorrect, just inconsistent with other functions) than make the function more restrictive, which would be reducing functionality.

mikesweeting commented 8 months ago

Hi @chjackson. I've opened a pull request for this issue, please check if you're happy with this solution. The output is matched to the input for the t vector, so unordered and duplicates are kept.

chjackson commented 8 months ago

Fixed in https://github.com/chjackson/flexsurv/commit/f18ab32b2df599d846a99ce83f1f419250e9a5df and subsequent commits - thanks again Iain and Mike.