probml / pmtk3

Probabilistic Modeling Toolkit for Matlab/Octave.
MIT License

hmmFitEm fails in simple case #49

Open ido opened 10 years ago

ido commented 10 years ago

From hughchri...@gmail.com on November 14, 2012 08:23:08

What steps will reproduce the problem?

  1. Run MATLAB R2012a on Windows 7.
  2. Call model = hmmFitEm(X, 25, 'gauss'); where X is a 1x8,000 vector of values between -3 and +3 on a grid with 0.25 spacing.
  3. The following output is seen:

Error using chol
Matrix must be positive definite.

Error in gaussLogprob (line 52)
    R = chol(Sigma);

Error in mixGaussInferLatent (line 17)
    logPz(:, k) = logMix(k) + gaussLogprob(mu(:, k), Sigma(:, :, k), X);

Error in mixGaussFit>estep (line 52)
    [weights, ll] = mixGaussInferLatent(model, data);

Error in emAlgo (line 62)
    [ess, ll] = estep(model, data);

Error in mixGaussFit (line 25)
    [model, loglikHist] = emAlgo(model, data, initFn, @estep, @mstep , ...

Error in hmmFitEm>initWithMixModel (line 244)
    mixModel = mixGaussFit(stackedData, nstates, 'verbose', false, 'maxIter', 10);

Error in hmmFitEm>initGauss (line 146)
    model = initWithMixModel(model, data);

Error in hmmFitEm>@(m,X,r)initFn(m,X,r,emissionPrior) (line 45)
    initFn = @(m, X, r)initFn(m, X, r, emissionPrior);

Error in emAlgo (line 56)
    model = init(model, data, restartNum);

Error in hmmFitEm (line 46)
    [model, loglikHist] = emAlgo(model, data, initFn, @estep, @mstep, EMargs{:});

    model = hmmFitEm(X, config.K, 'gauss');
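The chol failure at the top of this trace can be reproduced in isolation. The following is a minimal sketch, assuming (this is not confirmed by the trace itself) that one mixture component ends up holding only copies of a single grid value, so that its variance estimate is exactly zero. It uses only built-in functions and does not call pmtk3.

```matlab
% Assumption: one component was assigned only copies of the same grid value.
x = 0.25 * ones(1, 10);        % ten identical observations
Sigma = var(x, 1);             % variance normalised by N, exactly 0 here

% The two-output form of chol does not throw an error; p > 0 flags a
% matrix that is not positive definite.
[R, p] = chol(Sigma);
if p > 0
    fprintf('Sigma = %g is not positive definite (p = %d)\n', Sigma, p);
end
```

With the single-output form, R = chol(Sigma), the same input raises the "Matrix must be positive definite" error seen above.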

Original issue: http://code.google.com/p/pmtk3/issues/detail?id=49

ido commented 10 years ago

From hughchri...@gmail.com on November 21, 2012 06:34:19

I have this working now, but something funny is going on...

A more general example than the one provided in hmmGaussTest.m is below:

    nDays = 252;
    nObsPerDay = 855;
    z = 1; % z = 1 is univariate, z > 1 is multivariate
    thisPdf = @rand;
    data = repmat({thisPdf(z, nObsPerDay)}, [nDays 1]);
    kStates = 25;
    model = hmmFitEm(data, kStates, 'gauss');
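Note that in this example thisPdf(z, nObsPerDay) is evaluated once, before repmat copies the cell, so all nDays sequences in data are identical. Whether that contributes to the odd behaviour is not established here. As a sketch only, an independent sequence per day could be drawn with arrayfun; the variable names dataRepeated and dataIndependent are made up for illustration.

```matlab
nDays = 252;
nObsPerDay = 855;
z = 1;                          % z = 1 is univariate, z > 1 is multivariate
thisPdf = @rand;

% repmat copies one draw nDays times: every cell holds the same matrix.
dataRepeated = repmat({thisPdf(z, nObsPerDay)}, [nDays 1]);

% arrayfun calls thisPdf once per day, giving independent
% z-by-nObsPerDay sequences in an nDays-by-1 cell array.
dataIndependent = arrayfun(@(d) thisPdf(z, nObsPerDay), (1:nDays)', ...
                           'UniformOutput', false);

fprintf('repeated cells identical:    %d\n', ...
        isequal(dataRepeated{1}, dataRepeated{2}));
fprintf('independent cells identical: %d\n', ...
        isequal(dataIndependent{1}, dataIndependent{2}));
```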

ido commented 10 years ago

From hughchri...@gmail.com on November 21, 2012 06:54:55

Having said that, when I try with "real data" I get the same error as above for chol(Sigma) with Sigma = 0, even though the synthetic data (with thisPdf = @randn) works fine.

Any suggestions? Thanks.

ido commented 10 years ago

From hughchri...@gmail.com on November 21, 2012 07:20:49

The problem seems to be related to specifying too large a number of states for a given data set.

When Mu and Sigma cannot be assigned correctly (i.e. are set to NaN, zero, -Inf, etc.), they cause the code to fail in odd ways...

It would be nicer if, at the point of assigning unsuitable parameters, the code gave a sensible error message saying that the data will not support this number of states.
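One way such a check might look is sketched below. It is hypothetical and not part of pmtk3. It assumes the parameters are stored as d-by-K means and d-by-d-by-K covariances, which matches the mu(:, k) and Sigma(:, :, k) indexing visible in the mixGaussInferLatent line of the trace; the example values are made up.

```matlab
% Hypothetical check, not part of pmtk3: validate Gaussian emission
% parameters before they reach chol.
mu = [0.1, NaN, -0.5];                 % made-up means, one column per state (d = 1)
Sigma = cat(3, 0.2, 0.3, 0);           % made-up d-by-d-by-K covariances

K = size(Sigma, 3);
bad = false(1, K);
for k = 1:K
    S = Sigma(:, :, k);
    if any(~isfinite(mu(:, k))) || any(~isfinite(S(:)))
        bad(k) = true;                 % NaN or Inf parameter
    else
        [~, p] = chol(S);              % p > 0: not positive definite
        bad(k) = p > 0;
    end
end

if any(bad)
    msg = sprintf(['%d of %d states have invalid Gaussian parameters ' ...
                   '(states: %s). The data may not support this many states.'], ...
                  nnz(bad), K, mat2str(find(bad)));
    disp(msg);                         % a real check would call error(msg)
end
```

Here states 2 and 3 are flagged: state 2 because its mean is NaN, state 3 because its variance is zero.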

The largest value for kStates I can get to run is 4, which does not seem like very many. Zoubin G's code allows much larger values (~500) to run, but without really digging down into the code, I don't know why this is.