DenisRustand / INLAjoint

Joint modeling multivariate longitudinal and time-to-event outcomes with INLA
17 stars 0 forks source link

Use time-dependent covariates in the survival model #15

Closed NewUser36 closed 1 year ago

NewUser36 commented 1 year ago

Hello!

I'm trying to use INLAjoint to train a joint model for survival and longitudinal data with left-truncation and time-varying covariates.

I have a longitudinal variable that I want to model using a mixed effect model, but there are also time-varying covariates I want to use in the survival submodel, but not as longitudinal variables. I simply want to use them in the survival part as one would to with a Cox model with time-varying covariates (which are called "exogenous time-dependent covariates" in Rizopoulos' book "Joint Models for Longitudinal and Time-to-Event Data").

Here's a toy dataset to show what I want to do.

library(JMbayes2)
library(INLA) # testing branch
library(INLAjoint) # 23.04.04

data(pbc2, package="JMbayes2") # dataset

# create start-stop counting process variables
pbc <- pbc2[c("id", "serBilir", "drug", "year", "years",
              "status2", "spiders", "sex", "age", "hepatomegaly", "platelets")]
pbc$start <- pbc$year
splitID <- split(pbc[c("start", "years")], pbc$id)
pbc$stop <- unlist(lapply(splitID,
                          function (d) c(d$start[-1], d$years[1]) ))
pbc$event <- with(pbc, ave(status2, id,
                           FUN = function (x) c(rep(0, length(x)-1), x[1])))
pbc <- na.omit(pbc)
# create artificial left-truncation for every observation to replicate the real data
pbc <- pbc[pbc$start != 0, ]

str(pbc)
head(pbc)
#   id serBilir      drug      year    years status2 spiders    sex      age hepatomegaly platelets     start      stop event
# 2  1     21.3 D-penicil 0.5256817  1.09517       1     Yes female 58.76684          Yes       183 0.5256817 1.0951703     1
# 4  2      0.8 D-penicil 0.4983025 14.15234       0     Yes female 56.44782          Yes       188 0.4983025 0.9993429     0
# 5  2      1.0 D-penicil 0.9993429 14.15234       0     Yes female 56.44782          Yes       161 0.9993429 2.1027270     0
# 6  2      1.9 D-penicil 2.1027270 14.15234       0     Yes female 56.44782          Yes       122 2.1027270 4.9008871     0
# 7  2      2.6 D-penicil 4.9008871 14.15234       0     Yes female 56.44782          Yes       135 4.9008871 5.8892783     0
# 8  2      3.6 D-penicil 5.8892783 14.15234       0     Yes female 56.44782          Yes       100 5.8892783 6.8858833     0

It works using JMbayes2. Indeed, the Cox model can use time-dependent variables, serBilir and spiders in this example:

# longitudinal variable
fm2 <- mixed_model(hepatomegaly ~ sex + age + year + spiders, data = pbc,
                   random = ~ year | id, family = binomial())

# survival model
# see https://cran.r-project.org/web/packages/JMbayes/JMbayes.pdf page 31
# "# create start-stop counting process variables"
tdCox.pbc <- coxph(Surv(start, stop, event) ~ drug + spiders + serBilir + cluster(id),
                   data = pbc, x = TRUE, model = TRUE)

jointFit.pbc6 <- JMbayes2::jm(tdCox.pbc, fm2, time_var = "year")  # could take I little while to run

If I wanted to do a simple Cox model with R-INLA with delayed entries (and time-varying covariates), I could do

surv_inla <- inla.surv(truncation=pbc$start, time = pbc$stop, event = pbc$event)

pro.surv.inla <- inla(surv_inla ~drug+spiders, data = pbc,
  family = "coxph")
summary(pro.surv.inla)

# compare with survival::coxph, results are very similar.
tdCox.pbc <- coxph(Surv(start, stop, event) ~ drug + spiders,
                   data = pbc, x = TRUE, model = TRUE)
summary(tdCox.pbc)

which gives similar results to the coxph function.

Now, I want to do the same thing I did with JMbayes2, but this time with the joint() function from the INLAjoint package. I tried the following code, which is not working.

M2 <- joint(
  formSurv = surv_inla ~ drug + spiders + serBilir,
  formLong =
      hepatomegaly ~ age + year + spiders + (1+year|id),
  dataLong = pbc, 
  id = "id", 
  timeVar="year",
  assoc="CV",
  family = "binomial", 
  link="default"
  )
# Error in rep(1:nrow(dataframe), expand.df) : invalid 'times' argument

When the data for the survival model has one line per observation, it works. But, as I previously said, I have time-varying covariates in the data that I want to use for the survival model, and I do not want to model these variables using the formLong argument.

I wanted to know if it possible to do this with INLAjoint (or maybe with R-INLA?).

Thanks!

DenisRustand commented 1 year ago

Hi,

I just made a couple changes in INLAjoint, the latest release should be able to deal with this model. Please update INLAjoint and add "dataSurv=pbc" to your joint() call and let me know if it works.

Best, Denis

NewUser36 commented 1 year ago

It works now with v23.05.04-2. Thank you!