probml / pmtk3

Probabilistic Modeling Toolkit for Matlab/Octave.
MIT License

EM does not monotonically increase likelihood when fitting HMM to single sparse sequence #22

Open ido opened 10 years ago

ido commented 10 years ago

From murphyk2 on April 03, 2011 10:12:48

What steps will reproduce the problem?

a = [1,2,3,4,5,6,1,2,3,4,5,6,6,5,4,3,2,1,1,2,3,31,31,2,32,12,4,5,60,0,2,1,3,4,5,81,32,2,1];
[model, loglikHist] = hmmFit(a, 2, 'discrete');

(This example is due to George Toderici)

What is the expected output? What do you see instead?

The penalized log likelihood should go up at every EM iteration, but instead hmmFit gives a warning that it does not.

Please use labels and text to provide additional information.

The problem is that the data involves a non-consecutive set of integers, spanning 0 to 81. Internally this gets canonized to 1..12. However, there is still some residual problem. Perhaps the log prior on the transmat is not being added to the objective function.
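To see where the objective drops, one could inspect the returned history directly. The sketch below is only an illustration: the loglikHist values are made up (they are not output from hmmFit), and the tolerance tol is an assumed allowance for numerical noise.

```matlab
% Hypothetical log-likelihood history (made-up values, NOT output from hmmFit).
loglikHist = [-120.4, -98.7, -95.2, -95.9, -94.1];

tol = 1e-6;                          % assumed tolerance for numerical noise
steps = diff(loglikHist);            % change in the objective per EM iteration
badIters = find(steps < -tol) + 1;   % iterations at which the objective dropped

if isempty(badIters)
    disp('objective is non-decreasing');
else
    fprintf('objective decreased at iteration %d\n', badIters);
end
```

If the log prior really is missing from the objective, a check like this on the reported values could show decreases even though the true penalized objective is still going up.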

Original issue: http://code.google.com/p/pmtk3/issues/detail?id=22

ido commented 10 years ago

From RA.Dragun on April 03, 2011 23:24:27

I think (as far as I understand the problem) we can borrow a solution from http://sist.sysu.edu.cn/~syu/Publications/hsmmInitialize.m.txt, which "translates" the observed values 0 to 81 into the alphabet 1..12, so that

sequence=observable_values(indexes of observable values)
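A minimal sketch of that translation using the built-in unique, applied to the sequence from the original report. This is not the code from the linked file and not pmtk3's own implementation; the variable names support and canon are chosen here for illustration.

```matlab
a = [1,2,3,4,5,6,1,2,3,4,5,6,6,5,4,3,2,1,1,2,3,31,31,2,32,12,4,5,60,0,2,1,3,4,5,81,32,2,1];

% support: sorted distinct observed values (the alphabet)
% canon:   index of each observation into support, i.e. a label in 1..numel(support)
[support, ~, canon] = unique(a);
canon = reshape(canon, size(a));   % the orientation of canon differs across versions

% The original sequence is recovered by indexing the alphabet with the labels.
recovered = support(canon);
```

Here numel(support) is 12, and support(canon) reproduces a, which is the relation sequence = observable_values(indexes) written above.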

ido commented 10 years ago

From murphyk2 on April 04, 2011 08:02:08

Yes, I added a 'canonizeLabels' command but I have not had time to test this thoroughly. The correct solution is to allow/require the user to specify the support of the alphabet of their data.
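One possible shape for a user-specified support, sketched with the built-in ismember. The declared range 0:81 and the short sequence are assumptions for illustration only; this is not how canonizeLabels or hmmFit currently behave.

```matlab
support = 0:81;                    % hypothetical alphabet declared by the user
a = [1, 2, 31, 0, 81, 60];         % short illustrative sequence

% idx(k) is the position of a(k) in support; isKnown(k) is false for stray symbols.
[isKnown, idx] = ismember(a, support);
if ~all(isKnown)
    error('data contains symbols outside the declared support');
end

nSymbols = numel(support);         % emission alphabet size, including unseen symbols
```

Unlike a mapping built from the observed values alone, this keeps a slot for every symbol in the declared alphabet, so symbols that never occur in the training sequence still have an index.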