therneau / survival

Survival package for R

edit error message in `[.survfit`? #259

Closed ThomasSoeiro closed 3 months ago

ThomasSoeiro commented 3 months ago

Hello,

I expected that `survfit(*)[1:2]` would return the same as `unclass(survfit(*))[1:2]`. Eventually, I understood that the `[.survfit` method works on indexes/names of the strata (i.e. the RHS of the formula passed to `survfit()`) and not on indexes/names of the list returned by `survfit()`.

So maybe we could:

I can propose a patch if you are interested.

Thanks a lot for your work on survival!

library(survival)
fit <- survfit(Surv(time, status) ~ x, aml)

identical(
  unclass(fit[1:2]),
  unclass(fit)[1:2]
)
# [1] FALSE

identical(
  fit[c("x=Maintained", "x=Nonmaintained")],
  fit[1:2]
)
# [1] TRUE

fit[c("n", "time", "n.risk")]
# Error in `[.survfit`(fit, c("n", "time", "n.risk")) : 
#   strata n time n.risk not matched

# -> proposed change
# Error in `[.survfit`(fit, c("n", "time", "n.risk")) : 
#   strata 'n', 'time', 'n.risk' not matched
#   strata names or indexes are expected

fit[1:4]
# Error in `[.survfit`(fit, 1:4) : strata 3 4 not matched

# -> proposed change
# Error in `[.survfit`(fit, 1:4) :
#   strata 3, 4 not matched
#   strata names or indexes are expected

ThomasSoeiro commented 3 months ago

It is documented in ?survfit.object:

Survfit objects can be subscripted. This is often used to plot a subset of the curves, for instance. From the user's point of view the survfit object appears to be a vector, matrix, or array of curves. The first dimension is always the underlying number of curves or “strata”; for multi-state models the state is always the last dimension. Predicted curves from a Cox model can have a second dimension which is the number of different covariate prediction vectors.

So the only question is whether it is relevant to edit the error message in [.survfit. If yes, I can propose a patch. If no, feel free to close without further comments.

Thanks!
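
For readers following along, the second dimension mentioned in the quoted documentation can be illustrated with predicted curves from a Cox model. This is only a minimal sketch: the `lung` data set, the model formula, and the names `cfit`, `newd`, and `csurv` are assumptions chosen for illustration and are not part of this issue.

```r
library(survival)

# Illustrative Cox model with a stratification term (assumed example, not from the issue)
cfit <- coxph(Surv(time, status) ~ age + strata(sex), data = lung)

# Three hypothetical covariate vectors to predict for
newd <- data.frame(age = c(50, 60, 70))

# Predicted curves: one per stratum and per row of newd
csurv <- survfit(cfit, newdata = newd)

# Dimensions of the "array of curves"
dim(csurv)

# First stratum, second covariate vector
one <- csurv[1, 2]
print(one)
```

Here the first subscript selects strata and the second selects rows of `newd`, in line with the documentation quoted above.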

therneau commented 3 months ago

Although each curve is a complex object, it is useful to think of the result of survfit as a vector, matrix or array of curves. This makes it easy to print or plot a subset of them. It is a trick, yes, but useful for the user. Also see dim(fit).
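
A short sketch of this "vector of curves" view, reusing the `aml` fit from the first comment. The variable names `first` and `same` are illustrative only.

```r
library(survival)

# Same fit as in the original report: one curve per level of x
fit <- survfit(Surv(time, status) ~ x, data = aml)

# How many curves the object holds along the strata dimension
dim(fit)

# Labels that `[.survfit` accepts as names
names(fit$strata)

# Select a single curve by position or by stratum label
first <- fit[1]
same  <- fit["x=Maintained"]
print(first)

# The underlying list components stay reachable with `$` or after unclass()
head(fit$time)
parts <- unclass(fit)[c("n", "time")]
```

Subscripting with `[` therefore selects curves, while `$` or `unclass()` gives access to the components of the underlying list.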