Open tbeason opened 2 years ago
In theory, you could get some asymptotic standard error estimates from the Hessian used in the numerical optimization (local_step = TRUE). However, this is at least currently not supported and would need some work as the nloptr
used for the numerical optimization does not return the Hessian. Of course, "manually" estimating the model with optim
and logLik
functions with hessian=TRUE
is possible.
If you have a reasonable amount of sequences, you could compute nonparametric bootstrap estimates though.
I do have a lot of sequences. In the lowest case, around 50,000. In the largest case, around 1 million. How would I go about doing the bootstrap? I am familiar with bootstrap methods in general, but this is my first trip into HMM methods so I am open to suggestions. My first thought would be to randomly select entire sequences (with replacement), so that the original sequence remains intact but the sample composition becomes random.
Sidenote: The parallelization works well! Estimation is not too slow even with a large number of sequences when using 64 cores. Thanks for that!
Yes, your strategy of sampling randomly entire sequences sounds right. In order to avoid potential issues with multiple (local) optima (as well as in order to speed the bootstrap), I suggest you use the estimated parameters as initial values in the bootstrap loop, i.e. you have your estimated model based on the original data, say mod
, and then you define boot_model <- build_hmm(boot_sequences, transition_probs = mod$transition_probs, emission_probs = mod$emission_probs, initial_probs = mod$initial_probs)
.
I see that it is possible get standard errors for covariates in MHMM models.
I am wondering if it is possible to get standard errors for the transition probabilities and emission probabilities in simple HMM models? Or even some type of confidence interval?