Open kkmann opened 1 year ago
@danielinteractive, you might find this interesting too.
Here is a suggestion (admittedly super inefficient) of how this could look like. One of the many problems is that it is super slow to not preallocate the return vecors. Maybe chunked pre-allocation could work. Ideally, the functionality could be integrated in the sim.fmsm()
function though.
library(flexsurv)
mod_nobos_bos <- flexsurvreg(Surv(years, status) ~ 1, subset=(trans==1),
data = bosms3, dist = "weibull")
mod_nobos_death <- flexsurvreg(Surv(years, status) ~ 1, subset=(trans==2),
data = bosms3, dist = "weibull")
mod_bos_death <- flexsurvreg(Surv(years, status) ~ 1, subset=(trans==3),
data = bosms3, dist = "weibull")
tmat <- rbind(c(NA, 1, 2), c(NA, NA, 3), c(NA, NA, NA))
mdl <- fmsm(mod_nobos_bos, mod_nobos_death, mod_bos_death, trans = tmat)
sim <- function(mdl, newdata = data.frame(..null = NA), start = 0, nsim = 1L) {
as.numeric(simulate(mdl, newdata = newdata, nsim = nsim)[1, 1:nsim])
}
simulate_fmsm <- function(mdl, state, newdata = data.frame(..null = NA),
start = 0, tmax = Inf) {
tmat <- attr(mdl, "trans")
terminal_states <- as.integer(which(rowSums(!is.na(tmat)) == 0))
from <- integer()
to <- integer()
t <- numeric()
tt <- 0
while (!(state %in% terminal_states) & (tt < tmax)) {
# sample next state
from <- c(from, state)
next_states <- as.integer(which(!is.na(tmat[state, ])))
next_trans_ids <- tmat[state, next_states]
# sample from transitions
times <- numeric(length(next_states))
for (i in seq_along(next_trans_ids)) {
trans <- next_trans_ids[i]
times[i] <- sim(mdl[[i]], newdata = newdata, start = start, nsim = 1L)
}
state <- next_states[which.min(times)]
to <- c(to, state)
tt <- tt + min(times)
t <- c(t, tt)
start <- sqrt(.Machine$double.eps) # future states are censored at 0
}
# combine vectors into a data frame and return
res <- cbind(from, to, t)
return(res)
}
simulate_fmsm(mdl, state = 1)
sim.fmsm
simulates the entire pathway through a multistate model though. Do I understand correctly that you want to simulate data where each individual spends a time period of at least "start"
in every one of their states, so that their sojourn times follow standard parametric distributions that are left-truncated by this time? That seems quite specialised. What is the application?
No, only the current state ('start' in sim.fmsm
) sojourn would be needed. Imagine simulating forward for each individual at an interim analysis in a CT setting. For simulate.flexsurvreg()
this is possible via the start
parameter (see #150) but sim.fmsm()
seems to use a slightly different mechanism for sampling (or I missed the point) and I couldn't find a way to specify the time already spent in the starting state. Also, the 'start' paramter has a different meaning in sim.fmsm()
. I think my example code achieves what I want (going directly via simulate.flexsurvreg()
) but to me the sim.fmsm()
function would be the natural place for this functionality.
Edit: also interesting link to #38.
Hi,
this is related to #150 - I am trying to sample forward from the predictive distribution of an individual given current state and sojourn time in that state for a multistate model.
sim.fmsm()
only seems to support sampling conditional on last state but cannot accept the respective sojourn time ('start' paramter insimulate.flexsurvreg()
).Would this be feasible to add to
sim.fmsm()
? I suspect the critical lines are aroundhttps://github.com/chjackson/flexsurv-dev/blob/96dfe2ea0a360d4a79996e3b390d7cd27e0ab265/R/mstate.R#L862