therneau / survival

Survival package for R
381 stars 104 forks source link

Set `SIMPLIFY = FALSE` in `mapply()` calls in `survfit.resid()` functions #238

Closed DavidJesse21 closed 4 months ago

DavidJesse21 commented 8 months ago

Hi,

so I found a hack to circumvent #237 at the moment. My motivation for using survival::pseudo() was to use them to obtain bootstrap estimates of the covariance matrix of the resulting regression model. While programming a function that would do that, I encountered another issue, which in this case is more precisely located at rsurvpart1(). I will first provide a code snippet, which illustrates/reproduces this issue:

library(survival)

df = survival::veteran[, c("time", "status", "trt")]
df$trt = df$trt - 1L

# Bootstrapping
set.seed(42)
boot_idx = t(vapply(
  seq_len(100),
  \(i) sample(1:nrow(df), replace = TRUE),
  numeric(nrow(df))
))

# When/where does the error happen? (i.e. for which bootstrap sample)
for (i in 1:nrow(boot_idx)) {
  tryCatch(
    {
      idx = boot_idx[i, ]
      fit = survfit(Surv(time, status) ~ trt, data = df[idx, ], se.fit = FALSE)
      drop(pseudo(fit, type = "rmst", times = 200))
    },
    error = \(e) cat(i, "\n")
  )
}
# 21, 47 (fail)

df[boot_idx[21, ], ]
fit = survfit(Surv(time, status) ~ trt, data = df[boot_idx[21, ], ], se.fit = FALSE)
drop(pseudo(fit, type = "rmst", times = 200))
# Error in do.call(rbind, wtmat) : second argument must be a list

So I assume this piece of code causes the error:

wtmat <- mapply(function(x, y) pmin(outer(x, 
  -y, "+"), 0), auc1, auc2)

I don't know what the expected output of this should usually look like (in terms of dimensions etc.) but I assume inside of mapply() we should set SIMPLIFY = FALSE such that the output will be a list in any case. This might also be the case for other lines/pieces of code that use this function.

Just as a note: I don't know in which particular cases these errors happen and when they don't. Initially I did not find anything suspicious about the bootstrap sample that caused this error.

Best regards David

therneau commented 5 months ago

The survfit and residuals.survfit functions are in the midst of a major rewrite. (An issue with case weights, which was unfortunately in the interior of the underlying C code, i.e., the hardest to fix). One side effect is that the mapply is no more. Hopefully fully debugged and posted soon.

therneau commented 4 months ago

Number 237 has been repaired in the most recent update