Closed mafed closed 1 year ago
If to
is a indicator for a particular transition, it should be defined as a factor before fitting the flexsurvreg
model. In your dataset, msbmt
it is stored as numeric, so you cannot interpret the fitted model as a multi-state model with the transition number as a predictor.
However I could handle this better by checking whether tvar
is a factor before trying to read its levels in form.msm.newdata
, then giving an informative error message if it isn't.
I think that I should also have written help(msfit.flexsurvreg)
, section tvar
more clearly, to say e.g. "This should have been defined as a factor, with factor levels that correspond to elements of trans
".
You're right, adding msbmt$to <- as.factor(msbmt$to)
before model fitting did the trick. Now msfit.flexsurvreg
works but I noticed I should change the variable newd$to
to reflect that I assume transitions 2 and 3 to be proportional.
In the tutorial they achieve that using newd$strata <- c(1, 2, 2)
but I wouldn't know how to get the similar results here.
I've never tried to reproduce this tutorial, but that sounds like you want a model where transitions 2, 3 have a common shape parameter, that is different from the shape parameter for transition 1, but all transitions have a different scale parameter.
To implement that, you could create a factor for "transition 2 or 3", say, msbmt$to23 <- factor(msbmt$to %in% 2:3)
and place that as a covariate on the shape. Then in your flexsurvreg
model formula, use + to + shape(to23)
instead of + to + shape(to)
.
Also dist = "weibull"
is the accelerated failure time parameterisation of the Weibull. For the proportional hazards version you need "dist="weibullPH"
.
Thanks for the help, indeed you got what I want to obtain.
The problem is that the "to"
variable already takes only values 2 and 3 because is the destination of the transition, so the variable "to23"
would be always equal to TRUE
.
I thought about it a bit more and probably I'll obtain what I need without other modifications, apart from as.factor(to)
.
BTW, I want the transitions 2 and 3 to be proportional and that's the interpretation of the "pr"
variable that distinguish between between 2 and 3 despite they have the same shape parameter. Or are you saying that if I don't use the PH parametrisation of the Weibull I can't interpret the coefficient of "pr"
in that sense?
If $h_r(t)$ is the hazard function for transition r, $h_2(t)$ will only be proportional to $h_3(t)$ if you use a proportional hazards model. In an accelerated failure time model, identical shape parameters implies that the time acceleration coefficient will be constant with time (see the distributions vignette for the definition of this)
I'm OK if it has a different interpretation, thanks for pointing it out.
Though now that I have msfit.flexsurvreg
working I still don't understand why it doesn't give me the hazard for transition "1" of the "new patient" (see below code, modified).
In my actual project, that mimics this specific situation, I have 6 transition and the variable corresponding to "to"
(basically stratifying variable) is equal to c(1, 2, 3, 2, 4, 4)
and I use as "pr"
variable c(FALSE, FALSE, FALSE, TRUE, TRUE, FALSE)
. Meaning transition 2 and 4 have the same shape parameter as well as transition 5 and 6, the "pr"
variable serves to distinguish between these transitions, like in the tutorial.
The problem is that when I generate "newdata" with all covariates equal to zero, transition 5 and 6 are estimated to have zero hazard.
Is this still because "tvar"
needs to have the same factor labels of "trans"
?
If instead I use + trans + shape(to)
in the flexsurvreg
model formula, I get the hazard also for transition 1, basically forfeiting the use of the "pr"
time-varying covariate. This way I also should get the shape of transition 2 and 3 to be equal but I can still distinguish these transition by the trans parameters.
library(mstate)
library(flexsurv)
## take data from the 'mstate' tutorial
data("ebmt3")
tmat <- trans.illdeath(names = c("Tx", "PR", "RelDeath"))
covs <- c("dissub", "age", "drmatch", "tcd", "prtime")
msbmt <- msprep(time = c(NA, "prtime", "rfstime"),
status = c(NA, "prstat", "rfsstat"),
data = ebmt3,
trans = tmat,
keep = covs)
msbmt <- expand.covs(msbmt, covs, append = TRUE, longnames = FALSE)
## add time-varying covariate for PH transition, ref page 8 of tutorial
msbmt$pr <- msbmt$trans == 3
msbmt$to <- as.factor(msbmt$to)
table(msbmt$to)
## fit weibull model with 'same' formula
flex_c2 <- flexsurvreg(
Surv(Tstart, Tstop, status) ~
dissub1.1 + dissub2.1 +
age1.1 + age2.1 + drmatch.1 + tcd.1 + dissub1.2 + dissub2.2 +
age1.2 + age2.2 + drmatch.2 + tcd.2 + dissub1.3 + dissub2.3 +
age1.3 + age2.3 + drmatch.3 + tcd.3 +
pr + to + shape(to),
data = msbmt, dist = "weibull")
flex_c2
## preparing newdata
newd <- data.frame(dissub = rep(0, 3),
age = rep(0, 3),
drmatch = rep(0, 3),
tcd = rep(0, 3),
trans = as.factor(1:3))
newd$dissub <- factor(newd$dissub,
levels = 0:2,
labels = levels(ebmt3$dissub))
newd$age <- factor(newd$age,
levels = 0:2,
labels = levels(ebmt3$age))
newd$drmatch <- factor(newd$drmatch,
levels = 0:1,
labels = levels(ebmt3$drmatch))
newd$tcd <- factor(newd$tcd,
levels = 0:1,
labels = levels(ebmt3$tcd))
attr(newd, "trans") <- tmat
class(newd) <- c("msdata", "data.frame")
newd <- expand.covs(newd, covs[1:4], longnames = FALSE)
newd$to <- as.factor(c(2, 3, 3)) # same shape for trans 2 and 3
newd$pr <- c(FALSE, FALSE, TRUE) # time-varying covariate
## msfit.flexsurvreg gives $Haz == 0 for transition 1
msfit.flexsurvreg(flex_c2,
t = 5,
newdata = newd,
tvar = "to",
trans = tmat)
I don't see the purpose of pr
in the flexsurv model. As you say, + trans + shape(to)
achieves the model you said you wanted.
Perhaps the confusion here is because flexsurv is fully parametric. mstate
does things slightly differently, because the baseline hazard is nonparametric.
Required fixes done in https://github.com/chjackson/flexsurv-dev/commit/97cfcc9d2518024858f00ff98d86b279c6b40bec
I tried to reproduce the model
"c2"
from themstate
vignette withflexsurvreg
and the Weibull distribution. I found thatform.msm.newdata
gives an error, probably related to this lineHere below a Minimal Reproducible Example, I think it has to do with the variable names: it creates problems if you don't use the name
"trans"
neither in the formula nor as stratifying variable.