jscott6 closed this issue 3 years ago
Suppose the initial values for R are low, so that the epidemic subsides and the expected observations are close to zero. The log-likelihood is then very low if the actual observations are positive (say, in the hundreds or thousands). The model then appears to prefer pushing the intercept for R very high, which gives extremely large expected observations (say 1e50). This acts as a local mode that is hard to escape.
I have identified this as a leading cause of chains failing to converge in the models. It is especially pronounced when multiple groups are involved, and when over-dispersed observations are modelled with the negative binomial family.
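To make the trade-off concrete, here is a minimal numerical sketch rather than anything taken from the package. It assumes a mean/dispersion parameterisation of the negative binomial (variance mu + mu^2/phi), and the values y = 500 and phi = 10 are purely illustrative. It compares the log-likelihood of a single observation when the expected value is nearly zero against when it is absurdly large.

```python
import math

def nb_logpmf(y, mu, phi):
    """Negative binomial log pmf with mean mu and dispersion phi.

    Assumed parameterisation: variance = mu + mu**2 / phi.
    """
    return (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + phi * (math.log(phi) - math.log(mu + phi))
        + y * (math.log(mu) - math.log(mu + phi))
    )

def poisson_logpmf(y, mu):
    """Poisson log pmf, for comparison with the over-dispersed case."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

# Illustrative values only: one observation in the hundreds.
y, phi = 500, 10.0

ll_tiny = nb_logpmf(y, 1e-6, phi)   # epidemic has died out
ll_huge = nb_logpmf(y, 1e50, phi)   # expected observations blown up
ll_near = nb_logpmf(y, 500.0, phi)  # expected value close to the data

print(f"NB, mu=1e-6: {ll_tiny:.1f}")
print(f"NB, mu=1e50: {ll_huge:.1f}")
print(f"NB, mu=500:  {ll_near:.1f}")
print(f"Poisson, mu=1e-6: {poisson_logpmf(y, 1e-6):.1f}")
print(f"Poisson, mu=1e50: {poisson_logpmf(y, 1e50):.3g}")
```

Under these assumptions the negative binomial penalises a near-zero mean far more heavily than an enormous one, because the huge-mean penalty grows only with phi * log(mu). The Poisson penalty grows linearly in mu, so the same preference does not appear there. This is consistent with the problem being more pronounced for over-dispersed observations.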
A simple proposed solution is to pass init = 0 in the sampling_args argument to epim(). This could be made the default. With a log link, for example, this leads to an initial R of 1 for all time periods.
init_r = 1e-6 is now used by default for sampling, which appears to be a much better starting point for HMC.
Many sampled parameters go through a standard logistic transform. The default Stan initialization scheme can lead to poor starting values and divergent transitions for our model. We could write a custom function for getting starting values.
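As a rough illustration of why the initialization range matters, the sketch below assumes that Stan draws random initial values uniformly from (-init_r, init_r) on the unconstrained scale, with a default init_r of 2. The helper `init_ranges` is hypothetical and not part of any package. It maps the endpoints of that interval through a standard logistic transform and through a log link.

```python
import math

def inv_logit(x):
    """Standard logistic transform from the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def init_ranges(init_r):
    """Range of constrained starting values implied by init_r.

    Hypothetical helper: assumes unconstrained initial values are
    drawn uniformly from (-init_r, init_r).
    """
    lo, hi = -init_r, init_r
    return {
        "logistic": (inv_logit(lo), inv_logit(hi)),
        "log_link": (math.exp(lo), math.exp(hi)),
    }

default_ranges = init_ranges(2.0)   # assumed Stan default
tight_ranges = init_ranges(1e-6)    # value mentioned in this thread

for label, ranges in [("init_r = 2", default_ranges),
                      ("init_r = 1e-6", tight_ranges)]:
    print(label, ranges)
```

With init_r = 2, a logistic-transformed parameter can start anywhere between roughly 0.12 and 0.88, and a log-link quantity such as R between roughly 0.14 and 7.4. With init_r = 1e-6 every chain starts at essentially 0.5 and 1 respectively, which is effectively the same as init = 0.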