therneau / survival

Survival package for R

Linear predictor for Cox model fails for `NA` in stratification variable #180

hfrick closed this issue 2 years ago

hfrick commented 2 years ago

Without centering the linear predictor (`reference = "zero"`), a missing value in a stratification variable should not matter for predictions of type `"lp"`. This generally works, but I've run into what seems to be an edge case when the model has more than one `strata()` term:

library(survival)

new_data <- lung[1:4,]
new_data$sex[1] <- NA

# these work
mod <- coxph(Surv(time, status) ~ age + strata(sex), data = lung)
predict(mod, new_data, type = "lp", reference = "zero")
#>         1         2         3         4 
#> 1.1998842 1.1025963 0.9080205 0.9242352

mod <- coxph(Surv(time, status) ~ age + strata(ph.ecog, sex), data = lung)
predict(mod, new_data, type = "lp", reference = "zero")
#>         1         2         3         4 
#> 0.6198432 0.5695856 0.4690705 0.4774468

# with two strata terms, it errors
mod <- coxph(Surv(time, status) ~ age + strata(ph.ecog) + strata(sex), data = lung)
predict(mod, new_data, type = "lp", reference = "zero")
#> Error in temp.lev[[strat.term$vars]] <- NULL: replacement has length zero

Created on 2022-02-01 by the reprex package (v2.0.1)
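
For reference, the uncentered linear predictor is just the covariates times the coefficients, so it can be computed by hand without touching the strata at all. Below is a minimal sketch of that check. `manual_lp` is a hypothetical helper, not part of the survival package, and it assumes every coefficient corresponds to a numeric column of the same name in the new data (true here, where `age` is the only covariate).

```r
library(survival)

new_data <- lung[1:4, ]
new_data$sex[1] <- NA

# Hypothetical helper (not in survival): uncentered linear predictor X %*% beta.
# Assumes each coefficient name is a numeric column in `newdata`,
# so it will not work for factor covariates or interactions.
manual_lp <- function(fit, newdata) {
  x <- as.matrix(newdata[names(coef(fit))])
  drop(x %*% coef(fit))
}

# The model with two strata() terms fits without problems;
# only the strata drop out of the linear predictor.
mod2 <- coxph(Surv(time, status) ~ age + strata(ph.ecog) + strata(sex),
              data = lung)
manual_lp(mod2, new_data)
```

Because the strata never enter the calculation, the `NA` in `sex` has no effect on the result, which is the behaviour expected from `predict()` here.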

therneau commented 2 years ago

I had a double `[[` subscript in one place where it should have been a single `[`. What is interesting is that the (incorrect) code still worked if there was only one strata term. Certainly not a bug I would have found on my own. Thanks. Now fixed in my master copy.
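
The difference between the two subscripts can be reproduced with a plain list. This is only an illustration of the R behaviour, not the actual code in the package: `lev` is a made-up stand-in for a list of factor levels keyed by term name. With one name, `[[<- NULL` removes that element. With two names, `[[` indexes recursively into the first element, and assigning `NULL` into a character vector is an error.

```r
# Hypothetical stand-in for a list of levels keyed by model term
lev <- list(`strata(ph.ecog)` = c("0", "1", "2", "3"),
            `strata(sex)`     = c("1", "2"),
            other             = c("a", "b"))

one  <- "strata(sex)"
both <- c("strata(ph.ecog)", "strata(sex)")

# One name: [[<- NULL drops that element, same as [<- NULL would
lev1 <- lev
lev1[[one]] <- NULL
names(lev1)

# Two names: [[ means lev[["strata(ph.ecog)"]][["strata(sex)"]],
# so this tries to assign NULL inside a character vector and fails
lev2 <- lev
double_result <- tryCatch({
  lev2[[both]] <- NULL
  "no error"
}, error = function(e) conditionMessage(e))
double_result

# Single [ with two names: drops both elements, which is the intent
lev3 <- lev
lev3[both] <- NULL
names(lev3)
```

This is why the mistake stays hidden with a single `strata()` term: for a length-one index, `[[<- NULL` and `[<- NULL` do the same thing.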