Closed rfechner closed 2 years ago
It should be said, though, that in our case we should make use of previously gathered information by extracting and constructing the initial state probability vector as well as the transition and emission matrices. Since the EM algorithm only finds local optima, it is sensitive to its starting point. Thus, if we give the algorithm a better starting point, e.g. the estimated transition/emission matrices, we obtain a good model almost instantaneously. The question then becomes whether we should apply smoothing to the gathered information and then train the model on the observation sequences, or simply set the emission/transition/initial probabilities as is.
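To make the two options concrete, here is a minimal sketch of the extraction step. It assumes we have sequences in which both the state and the observation are known and encoded as integer indices. The function name `estimate_hmm_parameters` and the `alpha` pseudo-count are hypothetical, not something from our codebase. With `alpha=0` the probabilities are taken as is; with `alpha>0` additive (Laplace) smoothing is applied.

```python
def estimate_hmm_parameters(state_seqs, obs_seqs, n_states, n_symbols, alpha=0.0):
    """Count-based estimates of (pi, A, B) from labeled sequences.

    state_seqs / obs_seqs: lists of equally long integer sequences (assumed encoding).
    alpha: additive smoothing pseudo-count; alpha=0 keeps the raw frequencies.
    """
    pi_counts = [alpha] * n_states
    a_counts = [[alpha] * n_states for _ in range(n_states)]
    b_counts = [[alpha] * n_symbols for _ in range(n_states)]

    for states, obs in zip(state_seqs, obs_seqs):
        if not states:
            continue
        # The first state of each sequence contributes to the initial distribution.
        pi_counts[states[0]] += 1
        # Every (state, observation) pair contributes to the emission counts.
        for s, o in zip(states, obs):
            b_counts[s][o] += 1
        # Every pair of consecutive states contributes to the transition counts.
        for prev, nxt in zip(states, states[1:]):
            a_counts[prev][nxt] += 1

    def normalize(row):
        total = sum(row)
        if total == 0:
            # No data for this row: fall back to a uniform distribution.
            return [1.0 / len(row)] * len(row)
        return [c / total for c in row]

    pi = normalize(pi_counts)
    A = [normalize(row) for row in a_counts]
    B = [normalize(row) for row in b_counts]
    return pi, A, B


# Toy data, made up purely for illustration.
state_seqs = [[0, 0, 1], [1, 1, 0]]
obs_seqs = [[0, 1, 1], [1, 0, 0]]

pi_raw, A_raw, B_raw = estimate_hmm_parameters(state_seqs, obs_seqs, 2, 2)
pi_smooth, A_smooth, B_smooth = estimate_hmm_parameters(state_seqs, obs_seqs, 2, 2, alpha=1.0)
```

One point in favour of smoothing: an entry that is exactly zero in the transition or emission matrix stays zero under Baum-Welch re-estimation, so unsmoothed counts permanently rule out anything we did not happen to observe.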
In the literature, people have used a simple left-right model or an ergodic model as a baseline for their hidden Markov model. I have not come across a publication in which the probabilities were extracted from data. This makes sense: in the usual use case of a hidden Markov model we cannot observe the hidden state, so we cannot gather information about state transitions or emissions.
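For comparison, the two baseline topologies could be initialized roughly as follows. This is only a sketch: the function names and the `stay` self-transition probability are my own assumptions, and the left-right variant shown here only allows a state to stay or advance by one.

```python
def ergodic_transitions(n_states):
    """Fully connected model: every state can reach every other state (uniform start)."""
    return [[1.0 / n_states] * n_states for _ in range(n_states)]


def left_right_transitions(n_states, stay=0.5):
    """Left-right model: a state either stays or moves to its right neighbour."""
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        A[i][i] = stay
        A[i][i + 1] = 1.0 - stay
    # The last state has no right neighbour, so it is absorbing.
    A[n_states - 1][n_states - 1] = 1.0
    return A


A_ergodic = ergodic_transitions(3)
A_left_right = left_right_transitions(3)
```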