Iteration limit in optim() reached without convergence

gezelle-d commented 1 week ago

Hi there,

I'm hoping to get some advice on this convergence issue. I've read previous posts regarding similar issues and have tried a number of things to resolve it including removing participants with minimal data, scaling my variables, using months instead of years, increasing the maxit (up to 50000) and adjusting the fnscale to the log-likelihood value. Some advice on this would be much appreciated! Thank you :)

library(msm)

msm_df <- read.csv("/Users/msm_df.csv")

# create model
qmatrix <- matrix(c(
  0, 0, 1, 1, # never (never has to transition through states 3 or 4 to get to state 2)
  0, 0, 1, 1, # non-current
  0, 1, 0, 1, # occasional
  0, 1, 1, 0  # daily
), nrow = 4, ncol = 4, byrow = TRUE)

msm_df$dd_score_scaled <- as.numeric(scale(msm_df$dd_score))
msm_df$years_scaled <- as.numeric(scale(msm_df$years))

# run model
ms_model <- msm(status ~ years_scaled, subject = participant_id,
              data = msm_df, qmatrix = qmatrix, gen.inits = T, covariates = ~ dd_score_scaled, control=list(fnscale=6301, reltol = 1e-16, maxit = 5000)) 
#> Warning in msm.check.times(time, subject, state): Subjects 1274, 22453, 75717
#> and others only have one complete observation, which doesn't give any
#> information
#> Warning in msm.check.times(mf$"(time)", mf$"(subject)", mf$"(state)"): Subjects
#> 1274, 22453, 75717 and others only have one complete observation, which doesn't
#> give any information
#> Warning in msm.optim.optim(p = list(inits = c(qbase = -2.26434824925168, :
#> Iteration limit in optim() reached without convergence. Reported estimates are
#> not the maximum likelihood. Increase "maxit" or change optimisation method -
#> see help(optim) and help(msm).
print(ms_model)
#> 
#> Call:
#> msm(formula = status ~ years_scaled, subject = participant_id,     data = msm_df, qmatrix = qmatrix, gen.inits = T, covariates = ~dd_score_scaled,     control = list(fnscale = 6301, reltol = 1e-16, maxit = 5000))
#> 
#> Maximum likelihood estimates
#> Baselines are with covariates set to their means
#> 
#> Transition intensities with hazard ratios for each covariate
#>                   Baseline                          
#> State 1 - State 1 -5.030e-01 (-5.436e-01,-4.654e-01)
#> State 1 - State 3  5.029e-01 ( 4.652e-01, 5.436e-01)
#> State 1 - State 4  9.533e-05 ( 3.967e-13, 2.291e+04)
#> State 2 - State 2 -3.017e+00 (-5.111e+00,-1.781e+00)
#> State 2 - State 3  3.016e+00 ( 1.780e+00, 5.111e+00)
#> State 2 - State 4  7.439e-04 ( 1.575e-12, 3.515e+05)
#> State 3 - State 2  6.345e+00 ( 3.841e+00, 1.048e+01)
#> State 3 - State 3 -7.352e+00 (-1.136e+01,-4.759e+00)
#> State 3 - State 4  1.007e+00 ( 8.688e-01, 1.167e+00)
#> State 4 - State 2  5.415e-04 ( 7.744e-08, 3.786e+00)
#> State 4 - State 3  3.929e-01 ( 2.998e-01, 5.149e-01)
#> State 4 - State 4 -3.934e-01 (-5.133e-01,-3.015e-01)
#>                   dd_score_scaled                
#> State 1 - State 1                                
#> State 1 - State 3 1.060e+00 (9.866e-01,1.140e+00)
#> State 1 - State 4 9.358e-01 (7.599e-06,1.153e+05)
#> State 2 - State 2                                
#> State 2 - State 3 1.501e+00 (6.415e-01,3.513e+00)
#> State 2 - State 4 1.328e+00 (1.090e-03,1.618e+03)
#> State 3 - State 2 1.436e+00 (6.229e-01,3.308e+00)
#> State 3 - State 3                                
#> State 3 - State 4 1.203e+00 (1.025e+00,1.412e+00)
#> State 4 - State 2 3.893e-05 (1.262e-11,1.201e+02)
#> State 4 - State 3 1.182e+00 (9.714e-01,1.437e+00)
#> State 4 - State 4                                
#> 
#> -2 * log-likelihood:  6301.304

^{Created on 2024-09-26 with reprex v2.1.1}

chjackson commented 1 week ago

There won't necessarily be a way to "force" it to converge - sometimes the model is just not identifiable from the data. This commonly happens in models like this one where there are lots of transitions allowed in continuous time but the state is only observed intermittently.

I guess you are modelling smoking behaviour (or something like that), and the states match how your data are measured? A challenge here is that msm uses a continuous-time model, but in this sort of application, it is difficult to define the exact moment in continuous time when a person changes state.

Would it be possible to interpret the state as representing a quantity that has to change smoothly over time, say "intensity of smoking"? You could then disallow the direct transition from "never" to "daily", and from "daily" to "non-current". However, this model would still allow these transitions for any discrete time unit (e.g. from one day to the next).

To represent covariate effects, although this model wouldn't give hazard ratios on the never-daily transition intensity, you could still calculate the never-daily transition probability (for the time unit of your choice) for different values of the covariate.

gezelle-d commented 1 week ago

Thank you for your prompt response. I removed the direct transitions from never to daily, and from daily to non-current, however I'm now encountering a different warning message: "Optimisation has probably not converged to the maximum likelihood - Hessian is not positive definite." I tried running it without the covariate and just get the same warning message that I was initially getting "Iteration limit in optim() reached without convergence." Any further suggestions?

chjackson commented 1 week ago

Try also removing the non-current -> daily direct transition, because you could also assume this goes via "occasional". Look at the confidence intervals in your original output - these three rates are those for which there is no information in the data. If that doesn't work, play around with fixing some rates at sensible values and see if you can estimate the others, and/or explore fitting subsets of the model to subsets of the data.

gezelle-d commented 1 week ago

Removal of non-current to daily seems to have fixed things! Thank you so much for your help.

chjackson / msm

Iteration limit in optim() reached without convergence #105