dattalab / pyhsmm-library-models

library models built on top of pyhsmm
0 stars 1 forks source link

Model reinstantiation is broken #25

Closed alexbw closed 11 years ago

alexbw commented 11 years ago
import pyhsmm.basic.models
from pyhsmm.basic.distributions import GaussianFixed, NegativeBinomialIntegerRVariantDuration
from pyhsmm.util.general import rle
from pyhsmm.util.stats import cov
from pyhsmm.util.text import progprint_xrange

from pyhsmm_library_models.library_models import LibraryHSMMIntNegBinVariant, FrozenMixtureDistribution

from os.path import join
import json
import pymouse

datapath = "/Volumes/Data/ModelScan-8-13-2013/TMT_6-3-13_avg7frames_madeon_8-11-2013-fororchestra.npz"
f = np.load(datapath)
means = f['means']
sigmas = f['sigmas']
data = f['data']

modelpath = "/Volumes/Data/ModelScan-8-13-2013/7144_lhsmm_500_500_10000.0_1.0_avg7frames/"
labels = np.load(join(modelpath, "labels.npz"))['all_labels'][-1]
with open(join(modelpath, "model_info.json"), "r") as f:
    m = json.load(f)

lhsmmModel = pymouse.lhsmm.LHSMM(means, sigmas, n_states=m['n_states'], n_iter=100, n_iter_burnin=0, 
                frozen_a_0=m['frozen_a_0'], frozen_b_0=m['frozen_b_0'], 
                negbin_alpha_0=m['negbin_alpha_0'], negbin_beta_0=m['negbin_beta_0'], negbin_r=np.r_[0,0,1,1,1,1,1,1,1,1,1,1,1], 
                lhsmm_alpha_a_0=m['lhsmm_alpha_a_0'], lhsmm_alpha_b_0=m['lhsmm_alpha_b_0'], 
                lhsmm_gamma_a_0=m['lhsmm_gamma_a_0'], lhsmm_gamma_b_0=m['lhsmm_gamma_b_0'],
                left_censoring=True, parallel=False, n_samples_to_save=0,
                output_folder=None)
lhsmmModel.reinstantiate(data, labels)

ValueError Traceback (most recent call last)

in () 6 left_censoring=True, parallel=False, n_samples_to_save=0, 7 output_folder=None) ----> 8 lhsmmModel.reinstantiate(data, m['mode_labels']) /Users/Alex/Code/pymouse/lhsmm.pyc in reinstantiate(self, X, labels) 283 284 # Run the sampler once --> 285 self.hsmm_model.add_data(X, left_censoring=True, stateseq=labels) 286 self.hsmm_model.resample_obs_distns() 287 self.hsmm_model.resample_dur_distns() /Users/Alex/Code/pyhsmm_library_models/pyhsmm/models.pyc in add_data(self, data, stateseq, trunc, right_censoring, left_censoring, **kwargs) 443 left_censoring=left_censoring, 444 trunc=trunc, --> 445 **kwargs)) 446 447 def log_likelihood(self,data=None,trunc=None,**kwargs): /Users/Alex/Code/pyhsmm_library_models/library_models.pyc in **init**(self, data, precomputed_likelihoods, **kwargs) 154 if data is not None: 155 if precomputed_likelihoods is None: --> 156 precomputed_likelihoods = kwargs['model'].obs_distns[0].get_all_likelihoods(data) 157 self._likelihoods, self._shifted_likelihoods, self._maxes = precomputed_likelihoods 158 super(LibraryHSMMStatesIntegerNegativeBinomialVariant,self).__init__(data=data,**kwargs) /Users/Alex/Code/pyhsmm_library_models/library_models.pyc in get_all_likelihoods(self, data) 37 likelihoods = np.empty((data.shape[0],len(self.components))) 38 for idx, c in enumerate(self.components): ---> 39 likelihoods[:,idx] = c.log_likelihood(data) 40 maxes = likelihoods.max(axis=1) 41 shifted_likelihoods = np.exp(likelihoods - maxes[:,na]) /Users/Alex/Code/pyhsmm/basic/pybasicbayes/distributions.pyc in log_likelihood(self, x) 79 mu, sigma, D = self.mu, self.sigma, self.D 80 x = np.reshape(x,(-1,D)) - mu ---> 81 xs,LT = util.general.solve_chofactor_system(sigma,x.T,overwrite_b=True) 82 return -1./2. \* inner1d(xs.T,xs.T) - D/2_np.log(2_np.pi) - np.log(LT.diagonal()).sum() 83 /Users/Alex/Code/pyhsmm/basic/pybasicbayes/util/general.pyc in solve_chofactor_system(A, b, overwrite_b, overwrite_A) 11 def solve_chofactor_system(A,b,overwrite_b=False,overwrite_A=False): 12 L = np.linalg.cholesky(A) ---> 13 return scipy.linalg.solve_triangular(L,b,overwrite_b=overwrite_b,lower=True), L 14 15 def interleave(*iterables): /Users/Alex/anaconda/lib/python2.7/site-packages/scipy/linalg/basic.pyc in solve_triangular(a, b, trans, lower, unit_diagonal, overwrite_b, debug, check_finite) 153 154 if check_finite: --> 155 a1, b1 = map(np.asarray_chkfinite,(a,b)) 156 else: 157 a1, b1 = map(np.asarray, (a,b)) /Users/Alex/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.pyc in asarray_chkfinite(a, dtype, order) 588 if a.dtype.char in typecodes['AllFloat'] and not np.isfinite(a).all(): 589 raise ValueError( --> 590 "array must not contain infs or NaNs") 591 return a 592 ValueError: array must not contain infs or NaNs ``` ```
alexbw commented 11 years ago

The problem might be on my end, but I'm re-running code that I haven't changed, that worked previously. Looking into it.

alexbw commented 11 years ago

I believe the API we'd set was that I was supposed to pass in nans, and they'd get removed, right? I'm going up the chain, and in

return scipy.linalg.solve_triangular(L,b,overwrite_b=overwrite_b,lower=True), L L has no nans, but b is the original data, with nans in their original locations. https://github.com/mattjj/pyhsmm/blob/d586146f310bd36cffacfb83a6ca605166b535a9/util/general.py#L13

Going up one step, xs,LT = util.general.solve_chofactor_system(sigma,x.T,overwrite_b=True) x is just like the data, with nans in the original locations https://github.com/mattjj/pybasicbayes/blob/a50fb55024d9921f6bab990247541832df217cca/distributions.py#L81

I'll go back to earlier commits to find if and when the nans were originally removed.

alexbw commented 11 years ago

Right now, I'm just going to throw in a data[np.isnan(data).sum(1) > 0] = 0 before I call add_data.

mattjj commented 11 years ago

Can you grep for nan_to_num in the code you're working on? On my master and dev branches there's a call at line 149 of library_models.py which turns nans into zeros in _aBl (and hence they act like hidden data).

mattjj commented 11 years ago

Oh sorry, I read the issue badly; it's not getting there because a lower method is failing on nan's instead of just passing the nan's along. I think I fixed this issue in an update to pybasicbayes that I haven't pulled into this repo. I'm on it...

alexbw commented 11 years ago

This is working now.