hputter / mstate

https://hputter.github.io/mstate/
7 stars 5 forks source link

`msfit`: Error when `coxph(...,x=TRUE)` + `strata()` term present #20

Closed MetzgerSK closed 1 year ago

MetzgerSK commented 1 year ago

R: 4.1.3 64-bit, mstate: 0.3.2, survival: 3.3.1, Win11

Problem

msfit will throw an error if coxph has x=TRUE, but the model has a strata() term:

Error in order(mf[, strat], y[, 2], -y[, 3]) : argument 1 is not a vector

The problem arises in an otherwise vanilla situation: the model has no weights or offset, and coxph(..., y=TRUE) (the default).

Problem's Source

The mf object isn't created in this situation. msfit references mf in line 206 (and 211, for type=="right" instead of type=="counting"):

ord <- order(mf[, strat], y[, 2], -y[, 3])

The problem originates in msfit's lines 20–25:

if (is.null(object$y) || is.null(object[["x"]]) || !is.null(object$call$weights) ||
    (has.strata && is.null(object$strata)) || !is.null(attr(object$terms,
    "offset"))) {
    mf <- model.frame(object)
}
else mf <- NULL

With the coxph default (coxph(..., x=FALSE)), is.null(object[["x"]]) usually evaluates to TRUE. The if gets triggered, regardless of the truth of the other four "or" statements in the if()'s conditional. mf is generated without issue.

However, if coxph(..., x=TRUE), is.null(object[["x"]]) evaluates to FALSE. The other four "or"ed terms in the if() also evaluate to FALSE, in this situation: y=TRUE by default for coxph, and the model has no weights or offset. The else gets triggered instead of the if, and mf's never generated. Instead, its value is set to NULL.

mf's contents only get referenced post-line 25 by msfit if there's a strata() term (has.strata==TRUE), hence the issue only arising in this context.

MWE

library(mstate)
library(tidyverse)

# Load demo data
dat <- read_dta("http://www.stata-press.com/data/r17/catheter.dta")
dat <- dat %>%
         mutate(start = 0,
                stop = time,
                status = infect,
                from = 1,
                to = ifelse(female==0, 2, 3))

    ## Trans mat
    tmat <- trans.comprisk(2, c("At Risk",
                                "Infection (Male)", "Infection (Female)"))

    ## Set mstate attributes
    attr(dat, "trans") <- tmat
    class(dat) <- c("msdata", "data.frame")

# Gen covariate profile
cov.prf <- data.frame(age=25) %>% uncount(2) %>% mutate(strata=row_number()-1)

# ATT 1: coxph default (x=FALSE) - msfit works fine
mod <- coxph(Surv(start,stop,status) ~ age + strata(female), data=dat)
msf <- msfit(mod, cov.prf, trans=tmat)

# ATT 2: x=TRUE - msfit throws error
mod2 <- coxph(Surv(start,stop,status) ~ age + strata(female), data=dat, x=TRUE)
msf2 <- msfit(mod2, cov.prf, trans=tmat)
edbonneville commented 1 year ago

Thanks for spotting this - indeed x = TRUE in the coxph() call should not have any impact on later msfit() calls.. for now the (unelegant!) solution in 09f421f is to simply set object[["x"]] to NULL in cases where x = TRUE.