chjackson / msm

The msm R package for continuous-time multi-state modelling of panel data
https://chjackson.github.io/msm/
57 stars 17 forks source link

Working example of a phase-type HMM model #68

Open patricknnamdi opened 1 year ago

patricknnamdi commented 1 year ago

Hi Prof. Jackson,

I was wondering whether is an available working example of how to use the phase.states and phase.inits option to set up a hidden semi-Markov model.

chjackson commented 1 year ago

There isn't as far as I know, I'm afraid. In the meantime, I wonder if this commented code would be helpful to you or others.

This is a three-state non-hidden semi-Markov model, where the first state has two phases. This is equivalent to a specific hidden Markov model with four states. In the code, we simulate from the equivalent hidden Markov model, and fit the corresponding three-state phase-type model, and check that the estimated parameters are close to the true values used for simulation.

(a) Define a phase-type model and simulate data from it

mst1 <- 5 # Short-stay mean 
mst2 <- 30 # Long-stay mean 
p2 <- 0.9 # Long-stay probability 
q23 <- 1/5 # Transition rate between phases

l1 <- (p2/mst1)
mu1 <- (1-p2)/mst1
mu2 <- 1/(mst2-mst1)
Q <- rbind(c(0,l1,mu1*0.4,mu1*0.6),
           c(0,0,mu2*0.4,mu2*0.6),
           c(0,0,0,q23),
           c(0,0,0,0))
# Given the hidden state, the observed state is deterministic
E <- rbind(c(1,0,0,0),
           c(1,0,0,0),
           c(0,1,0,0),
           c(0,0,1,0))
nsubj <- 10000; nobspt <- 10
set.seed(1)
sim.df <- data.frame(subject = rep(1:nsubj, each=nobspt), time = seq(0, 100, length=nobspt))
sim2.df <- simmulti.msm(sim.df[,1:2], qmatrix=Q, ematrix=E)
statetable.msm(obs, subject, sim2.df)

(b) Fit the true phase-type model to the simulated data:

Q3 <- rbind(c(0,0.5,0.5),c(0,0,0.5),c(0,0,0))
s.msm <- msm(obs ~ time, subject=subject, data=sim2.df, phase.states=1, qmatrix=Q3, # default inits
             phase.inits=list(list(trans=0.05, 
                                   exit=matrix(c(0.1,0.1,0.1,0.1),nrow=2))),
             control=list(trace=1,REPORT=1,fnscale=50000,maxit=10000),method="BFGS")

# Parameter estimates should agree with the values used for simulation
s.msm 
c(l1, mu1*0.4, mu1*0.6, mu2*0.4, mu2*0.6, q23)
phasemeans.msm(s.msm)

I don't know if this is similar to what you wanted. A "hidden semi-Markov model" would be slightly more complex, in that there'd be extra emission/misclassification probabilities on top of this.

A danger with these models is that they often not identifiable from intermittently-observed data, so don't be surprised if they don't converge in your particular application.

chjackson commented 7 months ago

For anyone interested in phase-type semi-Markov models, there is now an experimental package msmbayes which implements these, and some of the other models in msm, using Bayesian inference in Stan. It is in the early stages of development, however, and users will need some Bayesian expertise to set up and interpret models with this package.