chjackson / msm

The msm R package for continuous-time multi-state modelling of panel data
https://chjackson.github.io/msm/

Baseline intensity constraint length when some transitions are never explicitly observed #74

Closed marc-vaisband closed 1 year ago

marc-vaisband commented 1 year ago

When specifying a model, there can be allowed transitions that are never explicitly observed in the data, but whose transition rates we would still like to estimate. For example, consider a four-state model with allowed transitions (1, 2), (1, 3), (2, 4), (3, 4), in which we additionally require that the pairs [(1, 2), (3, 4)] and [(1, 3), (2, 4)] each share a rate.

example_Q = rbind(c(0, 1, 1, 0),
                  c(0, 0, 0, 1),
                  c(0, 0, 0, 1),
                  c(0, 0, 0, 0))

example_transition_df = data.frame(patient_ID = c(1, 1, 1, 2, 2, 3, 3), 
                                   observed_state = c(1, 2, 4, 1, 4, 1, 3), 
                                   years = c(0.0, 1.0, 2.0, 0.0, 1.0, 0.0, 1.0))

example_constraint_vector = c(1, 2, 2, 1)
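
To make the intended pairing explicit, here is a minimal base-R sketch that lists the allowed transitions next to their constraint group. It assumes the entries of qconstraint correspond to the nonzero entries of the Q matrix read across rows, which is consistent with the pairing described above; the names allowed and transitions are illustrative only.

```r
example_Q <- rbind(c(0, 1, 1, 0),
                   c(0, 0, 0, 1),
                   c(0, 0, 0, 1),
                   c(0, 0, 0, 0))
example_constraint_vector <- c(1, 2, 2, 1)

# which() scans column by column, so transposing first makes it walk
# the original matrix row by row (assumed to be the qconstraint order).
allowed <- which(t(example_Q) != 0, arr.ind = TRUE)

# After the transpose, "col" is the origin state and "row" the destination.
transitions <- data.frame(from = allowed[, "col"],
                          to = allowed[, "row"],
                          constraint_group = example_constraint_vector)
print(transitions)

# One constraint entry per allowed transition: 4 here, not 3.
stopifnot(nrow(transitions) == length(example_constraint_vector))
```

Transitions with the same constraint_group value are the ones meant to share a rate.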

However, if we try to fit this,

foo = msm(observed_state ~ years, subject=patient_ID, 
          data=example_transition_df, qmatrix=example_Q, 
          gen.inits=TRUE, qconstraint = example_constraint_vector)

it errors out with the message

Error in msm.form.qmodel(qmatrix, qconstraint, analyticp, use.expm, phase.states) : baseline intensity constraint of length 4, should be 3

Presumably this is because the rate for the (3, 4) transition is automatically dropped somewhere, so there are more constraints than parameters. However, we do have information about it, since we know it must be equal to the (1, 2) rate.

Moreover, assuming that the transition (3, 4) is impossible would misspecify the model, as it would change the likelihood of observing a (1, 4) transition.
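
As a worked illustration of that last point, write a for the shared rate of (1, 2) and (3, 4), and b for the shared rate of (1, 3) and (2, 4); these symbols are introduced here only for the sketch. The probability of being in state 4 at time t after starting in state 1 can then be written in closed form for both models:

```latex
% Constrained model: q_{12} = q_{34} = a, \; q_{13} = q_{24} = b
\begin{align*}
P_{12}(t) &= e^{-bt}\bigl(1 - e^{-at}\bigr), \qquad
P_{13}(t) = e^{-at}\bigl(1 - e^{-bt}\bigr), \qquad
P_{11}(t) = e^{-(a+b)t}, \\
P_{14}(t) &= 1 - P_{11}(t) - P_{12}(t) - P_{13}(t)
           = \bigl(1 - e^{-at}\bigr)\bigl(1 - e^{-bt}\bigr).
\end{align*}

% Same rates, but with (3, 4) treated as impossible (q_{34} = 0),
% so state 3 is absorbing and state 4 is reachable only through state 2
\begin{align*}
\tilde{P}_{14}(t) &= \int_0^t a\, e^{-(a+b)s}\bigl(1 - e^{-b(t-s)}\bigr)\, ds
  = \frac{a}{a+b}\bigl(1 - e^{-(a+b)t}\bigr) - e^{-bt}\bigl(1 - e^{-at}\bigr).
\end{align*}
```

The first expression tends to 1 as t grows, while the second tends to a / (a + b), so the two models assign different likelihood contributions to an observed pair of states 1 and 4.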

How should this case be handled?

chjackson commented 1 year ago

Thanks for the report. This is a consequence of some inappropriate behaviour by the crudeinits.msm procedure for generating initial values. If there were no observed direct data for a particular allowed transition, the initial value was set to zero. The qconstraint consistency check was done after these initial values were set, which led to the error you saw.

I have fixed this in the GitHub version by using a small positive number for the initial value in this case. You can work around this in the CRAN version by supplying your own initial values in the msm call, instead of using gen.inits=TRUE.
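
A minimal sketch of that workaround, using the data from the original report. It assumes that the nonzero entries of qmatrix are taken as initial values when gen.inits is not set; the values 0.5 and 1 are arbitrary placeholders rather than recommended starting points, and the names example_Q_inits and fit are illustrative.

```r
library(msm)

example_transition_df <- data.frame(
  patient_ID     = c(1, 1, 1, 2, 2, 3, 3),
  observed_state = c(1, 2, 4, 1, 4, 1, 3),
  years          = c(0.0, 1.0, 2.0, 0.0, 1.0, 0.0, 1.0)
)

# Positive initial values for every allowed transition, including the
# never-observed (3, 4). Constrained pairs start from equal values.
example_Q_inits <- rbind(c(0, 0.5, 1.0, 0),
                         c(0, 0,   0,   1.0),
                         c(0, 0,   0,   0.5),
                         c(0, 0,   0,   0))

# gen.inits is left at its default, so crudeinits.msm is not used here.
fit <- msm(observed_state ~ years, subject = patient_ID,
           data = example_transition_df, qmatrix = example_Q_inits,
           qconstraint = c(1, 2, 2, 1))
print(fit)
```

With only three subjects the estimates will be very imprecise, so this is meant only to show that the call gets past the constraint length check.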