mattjj / pyhsmm

LinAlgError: Matrix is not positive definite #43

Closed: gxhrid closed this issue 9 years ago

gxhrid commented 9 years ago

I feed many sequences of data to pyhsmm and want to use the mean-field inference method of the HMM model.

But the "Matrix is not positive definite" exception always occurs; the stack trace is attached. My data are fairly large and the program is parallelized, so it is hard for me to debug this quickly.

Is there anything wrong with my raw data, such that it fails to meet the model's data specification? Or is this an inherent problem with the model? Should I try a different random seed?

Traceback:

.......

(array([[ 59.11669952,  18.10745223]]), {}), (array([[ 59.30666302,  18.12257222]]), {}), (array([[ 59.11669952,  18.10745223]]), {}), (array([[ 59.11669952,  18.10745223]]), {}),
(array([[ 59.11669952,  18.10745223]]), {}), (array([[ 60.66855394,  17.11880207],
       [ 59....14228348],
       [ 60.66855394,  17.11880207]]), {}), (array([[ 59.11669952,  18.10745223]]), {}), (array([[ 59.11669952,  18.10745223]]), {}), (array([[ 59.29484061,  18.08184385]]), {}), (array([[ 59.17498388,  18.14228348]]), {}),
(array([[ 59.29484061,  18.08184385],
       [ 59.30662071,  18.12317617]]), {}), (array([[ 59.32020145,  18.07116866]]), {}), (array([[ 59.35720982,  18.19441662]]), {}), (array([[ 35.91176766,  14.50242894],
       [ 35.89958409,  14.49356318]]), {}), (array([[ 58.33435573,  16.45424642]]), {}), (array([[ 56.67240501,  16.35093927]]), {}), (array([[ 58.7847737 ,  16.92186356]]), {}), (array([[ 35.8981899,  14.5140532]]), {}), ...],
...]
    579 
    580             for s, stats in zip(
    581                     [s for grp in list_split(states_list, num_procs) for s in grp],
    582                     [s for grp in allstats for s in grp]):

...........................................................................
/usr/local/python/lib/python3.4/site-packages/joblib-0.8.4-py3.4.egg/joblib/parallel.py in __call__(self=Parallel(n_jobs=32), iterable=<generator object <genexpr>>)
    655             if pre_dispatch == "all" or n_jobs == 1:
    656                 # The iterable was consumed all at once by the above for loop.
    657                 # No need to wait for async callbacks to trigger to
    658                 # consumption.
    659                 self._iterating = False
--> 660             self.retrieve()
        self.retrieve = <bound method Parallel.retrieve of Parallel(n_jobs=32)>
    661             # Make sure that we get a last message telling us we are done
    662             elapsed_time = time.time() - self._start_time
    663             self._print('Done %3i out of %3i | elapsed: %s finished',
    664                         (len(self._output),

    ---------------------------------------------------------------------------
    Sub-process traceback:
    ---------------------------------------------------------------------------
    LinAlgError                                        Thu Apr  9 19:34:30 2015
PID: 24913                      Python 3.4.3: /usr/local/python/bin/python3
...........................................................................
/usr/local/python/lib/python3.4/site-packages/pyhsmm/parallel.py in _get_stats(idx=0)
     19     for data, kwargs in zip(datas,kwargss):
     20         model.add_data(data,stateseq=np.empty(data.shape[0]),**kwargs)
     21         states_list.append(model.states_list.pop())
     22 
     23     for s in states_list:
---> 24         s.meanfieldupdate()
        s.meanfieldupdate = <bound method HMMStatesEigen.meanfieldupdate of <pyhsmm.internals.hmm_states.HMMStatesEigen object>>
     25 
     26     return [s.all_expected_stats for s in states_list]
     27 
     28 def _get_sampled_stateseq(idx):

...........................................................................
/usr/local/python/lib/python3.4/site-packages/pyhsmm/internals/hmm_states.py in meanfieldupdate(self=<pyhsmm.internals.hmm_states.HMMStatesEigen object>)
    450         self.stateseq = self.expected_states.argmax(1).astype('int32') # for plotting
    451 
    452     def meanfieldupdate(self):
    453         self.clear_caches()
    454         self.all_expected_stats = self._expected_statistics(
--> 455                 self.mf_trans_matrix,self.mf_pi_0,self.mf_aBl)
        self.mf_trans_matrix = array([[  1.15818989e-005,   1.15818989e-005,   ...,
          1.51828657e-004,   9.93934951e-001]])
        self.mf_pi_0 = array([  1.37364316e-113,   6.92310533e-127,   2...12,
         6.92310533e-127,   6.92310533e-127])
        self.mf_aBl = array([[ -1.40707318e+002,  -2.61806850e+004,  -...,
          1.32804846e-320,   1.58101007e-322]])
    456 
    457     def get_vlb(self):
    458         if self._normalizer is None:
    459             self.meanfieldupdate() # NOTE: sets self._normalizer

...........................................................................
/usr/local/python/lib/python3.4/site-packages/pyhsmm/internals/hmm_states.py in mf_aBl(self=<pyhsmm.internals.hmm_states.HMMStatesEigen object>)
    424         if self._mf_aBl is None:
    425             T = self.data.shape[0]
    426             self._mf_aBl = aBl = np.empty((T,self.num_states))
    427 
    428             for idx, o in enumerate(self.obs_distns):
--> 429                 aBl[:,idx] = o.expected_log_likelihood(self.data).ravel()
        aBl = array([[ -1.40707318e+002,  -2.61806850e+004,  -...,
          1.32804846e-320,   1.58101007e-322]])
        idx = 15
        o.expected_log_likelihood = <bound method Gaussian.expected_log_likelihood of <pybasicbayes.distributions.Gaussian object>>
        self.data.ravel = <built-in method ravel of numpy.ndarray object>
    430             aBl[np.isnan(aBl).any(1)] = 0.
    431 
    432         return self._mf_aBl
    433 

...........................................................................
/usr/local/python/lib/python3.4/site-packages/pybasicbayes/distributions.py in expected_log_likelihood(self=<pybasicbayes.distributions.Gaussian object>, x=array([[-15.05803229, -31.72762012],
       [ -6....55570334],
       [ -6.30502495, -28.55746006]]))
    668 
    669     def expected_log_likelihood(self,x):
    670         mu_n, sigma_n, kappa_n, nu_n = self.mu_mf, self.sigma_mf, self.kappa_mf, self.nu_mf
    671         D = len(mu_n)
    672         x = np.reshape(x,(-1,D)) - mu_n # x is now centered
--> 673         xs = np.linalg.solve(self.sigma_mf_chol,x.T)
        xs = undefined
        self.sigma_mf_chol = undefined
        x.T = array([[-15.05803229,  -6.00951302,  -6.32170992...      -28.57186493, -28.55570334, -28.55746006]])
    674 
    675         # see Eqs. 10.64, 10.67, and 10.71 in Bishop
    676         return self._loglmbdatilde()/2 - D/(2*kappa_n) - nu_n/2 * \
    677                 inner1d(xs.T,xs.T) - D/2*np.log(2*np.pi)

...........................................................................
/usr/local/python/lib/python3.4/site-packages/pybasicbayes/distributions.py in sigma_mf_chol(self=<pybasicbayes.distributions.Gaussian object>)
    643         self._sigma_mf_chol = None
    644 
    645     @property
    646     def sigma_mf_chol(self):
    647         if self._sigma_mf_chol is None:
--> 648             self._sigma_mf_chol = np.linalg.cholesky(self.sigma_mf)
        self._sigma_mf_chol = None
        self.sigma_mf = array([[  1.10661308e+107,   5.16037189e+106],
       [  5.27504682e+106,   1.72012396e+106]])
    649         return self._sigma_mf_chol
    650 
    651     def get_vlb(self):
    652         D = len(self.mu_0)

...........................................................................
/root/Downloads/matplotlib-1.4.3/.eggs/numpy-1.9.2-py3.4-linux-x86_64.egg/numpy/linalg/linalg.py in cholesky(a=array([[  1.10661308e+107,   5.16037189e+106],
       [  5.27504682e+106,   1.72012396e+106]]))
    598     a, wrap = _makearray(a)
    599     _assertRankAtLeast2(a)
    600     _assertNdSquareness(a)
    601     t, result_t = _commonType(a)
    602     signature = 'D->D' if isComplexType(t) else 'd->d'
--> 603     return wrap(gufunc(a, signature=signature, extobj=extobj).astype(result_t))
        wrap = <built-in method __array_prepare__ of numpy.ndarray object>
        gufunc = <ufunc 'cholesky_lo'>
        a = array([[  1.10661308e+107,   5.16037189e+106],
       [  5.27504682e+106,   1.72012396e+106]])
        signature = 'd->d'
        extobj = [8192, 1536, <function _raise_linalgerror_nonposdef>]
        extobj.astype = undefined
        result_t = <class 'numpy.float64'>
    604 
    605 # QR decompostion
    606 
    607 def qr(a, mode='reduced'):

...........................................................................
/root/Downloads/matplotlib-1.4.3/.eggs/numpy-1.9.2-py3.4-linux-x86_64.egg/numpy/linalg/linalg.py in _raise_linalgerror_nonposdef(err='invalid value', flag=8)
     88 
     89 def _raise_linalgerror_singular(err, flag):
     90     raise LinAlgError("Singular matrix")
     91 
     92 def _raise_linalgerror_nonposdef(err, flag):
---> 93     raise LinAlgError("Matrix is not positive definite")
     94 
     95 def _raise_linalgerror_eigenvalues_nonconvergence(err, flag):
     96     raise LinAlgError("Eigenvalues did not converge")
     97 

LinAlgError: Matrix is not positive definite
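
For reference, the covariance dumped in the innermost frame genuinely has no Cholesky factor; a quick standalone check with numpy (values copied from the traceback above) confirms it:

import numpy as np

# Covariance from the sub-process traceback above: its determinant is
# negative, so one eigenvalue is negative and no Cholesky factor exists.
sigma_mf = np.array([[1.10661308e+107, 5.16037189e+106],
                     [5.27504682e+106, 1.72012396e+106]])
print(np.linalg.det(sigma_mf))       # about -8.2e+212
print(np.linalg.eigvalsh(sigma_mf))  # one negative eigenvalue
np.linalg.cholesky(sigma_mf)         # raises LinAlgError: Matrix is not positive definite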
mattjj commented 9 years ago

Linear algebra errors are probably data-dependent. Would it be possible to send me a script and a data file that reproduce this error so I can check it out?

Generally speaking, increasing the strength of the prior can make things better conditioned by boosting the diagonals of these matrices. What are your prior hyperparameters?
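
To illustrate the effect (a rough sketch of the standard normal-inverse-Wishart update, not pyhsmm's internals): the posterior scale matrix adds the prior scale sigma_0 to the data scatter, so scaling up sigma_0 directly boosts the diagonal and lowers the condition number:

import numpy as np

# Sketch of the standard NIW update (not pyhsmm internals): the posterior
# scale is sigma_0 plus the data scatter, so a stronger prior scale keeps
# the matrix well conditioned even when the scatter is nearly singular.
x = np.array([[46.1, -60.8], [45.7, -62.7], [46.4, -62.6]])  # toy data
scatter = np.cov(x.T) * (len(x) - 1)       # empirical scatter matrix

for scale in (1., 10., 100.):
    sigma_n = scale * np.eye(2) + scatter  # posterior scale (mu_0 terms omitted)
    print(scale, np.linalg.cond(sigma_n))  # condition number falls as the prior strengthens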

gxhrid commented 9 years ago

I have sent the materials to reproduce this issue by email.

Thank you for the reminder about changing the prior hyperparameters. I will try that.

gxhrid commented 9 years ago

Following your reminder, I changed the prior hyperparameters from

obs_hypparams = dict(mu_0=np.zeros(2), sigma_0=np.eye(2), kappa_0=0.2, nu_0=5)

to initializing them from the mean and variance of the data, as follows:

temp = np.vstack(data)
mu_0 = np.mean(temp, 0)
sigma_0 = np.eye(2) * np.std(temp, 0) ** 2
del temp
obs_hypparams = dict(mu_0=mu_0, sigma_0=sigma_0, kappa_0=0.2, nu_0=5)

It runs well now. Previously, I thought the prior only played a regularization role, which should not matter much, especially in a big-data setting.
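
One plausible contributor (a sketch of the standard NIW posterior update, not pyhsmm code): with mu_0 = 0 and data centered far from the origin, the prior-mean mismatch term adds a nearly rank-one outer product that swamps the posterior scale matrix:

import numpy as np

# Illustrative only: the NIW mismatch term kappa_0*n/(kappa_0+n) * outer(xbar-mu_0, xbar-mu_0)
# is rank one, so with mu_0 = 0 and xbar near (46, -63) it dominates sigma_0 = eye(2).
xbar, kappa_0, n = np.array([46., -63.]), 0.2, 8
mismatch = (kappa_0 * n / (kappa_0 + n)) * np.outer(xbar, xbar)
print(np.linalg.cond(np.eye(2) + mismatch))  # roughly 1e3; with mu_0 = xbar it would be ~1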

The entire script:

import numpy as np
from pyhsmm import models
from pybasicbayes import distributions

num_states = 20

data = []
data.append(np.array([[ 46.09624038, -60.75663923]] ))
data.append(np.array([[ 45.67615317, -62.71079608],
        [ 46.42577037, -62.59713028],
        [ 46.23305503, -62.61693047]]))
data.append(np.array([[ 47.39252692, -61.78993208]] ))
data.append(np.array([[ 45.94008088, -66.61341137],
        [ 46.81245472, -71.2273164 ]]))
data.append(np.array([[ 45.44950604, -73.74036312]] ))

temp = np.vstack(data)
mu_0 = np.mean(temp, 0)
sigma_0 = np.eye(2) * np.std(temp, 0) ** 2
del temp

obs_hypparams = dict(mu_0=mu_0, sigma_0=sigma_0, kappa_0=0.2, nu_0=5)
hmm = models.HMM(
        obs_distns=[distributions.Gaussian(**obs_hypparams) for i in range(num_states)],
        alpha=10., init_state_concentration=1.)

vlb = []
for i in range(10):
    vlb.append(hmm.meanfield_batch_cdstep(data, num_procs=0))
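
A quick sanity check after the run (a sketch; it only uses objects defined in the script above):

# Sanity checks (sketch): both factorizations should succeed without LinAlgError.
np.linalg.cholesky(sigma_0)          # prior scale matrix is positive definite
for o in hmm.obs_distns:
    np.linalg.cholesky(o.sigma_mf)   # each mean-field covariance factors cleanly
print(vlb)                           # variational lower bound per iteration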