truncate HDP-based sampled model to fixed state size

mattjj commented 11 years ago

Added in e08182c. See this example file, but the heart of it is

newmodel = model.truncate_num_states(target_num=10)

or

model.truncate_num_states(target_num=10,destructive=True) # destructive, in-place

I pushed to both dev and master. I'll leave this issue open until it's been run on real data (I haven't done that). I think it'd be best for @alexbw to try it now.
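
For readers following along, the difference between the two calls above (return a truncated copy versus truncate in place) can be sketched with a toy class. Everything here is hypothetical: `ToyModel` is not the pyhsmm class, and keeping the most-used states is an assumption made only for illustration, since the thread does not say how states are selected.

```python
from collections import Counter
from copy import deepcopy


class ToyModel(object):
    """Hypothetical stand-in for a sampled model; NOT the pyhsmm class."""

    def __init__(self, stateseq):
        self.stateseq = list(stateseq)

    @property
    def num_states(self):
        return len(set(self.stateseq))

    def truncate_num_states(self, target_num, destructive=False):
        # Work on self when destructive, otherwise on a deep copy.
        model = self if destructive else deepcopy(self)
        # Assumption: keep the target_num most frequently used states.
        keep = [s for s, _ in Counter(model.stateseq).most_common(target_num)]
        # Toy choice: relabel every dropped state to the most-used kept state.
        model.stateseq = [s if s in keep else keep[0] for s in model.stateseq]
        return model


model = ToyModel([0, 0, 1, 2, 2, 2, 3, 1, 2, 0])

newmodel = model.truncate_num_states(target_num=2)  # copy; model is untouched
states_before = model.num_states

model.truncate_num_states(target_num=2, destructive=True)  # in place
```

Returning the model in both modes is a design choice of this sketch, so that either call can be chained; the thread does not say what the real method returns in destructive mode.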

mattjj commented 11 years ago

Fixed a normalization issue in 67ae175. It wouldn't have caused any Viterbi errors, but it would have made any likelihoods computed before running Viterbi_EM_fit() incorrect.
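
A minimal sketch of why a normalization bug can leave Viterbi decoding intact while corrupting likelihoods. The thread does not say which quantity was unnormalized; this example assumes, purely for illustration, an initial state distribution that is off by a constant factor in a small discrete HMM. All names and numbers are made up.

```python
import math


def viterbi(pi, A, B, obs):
    """Most likely state path for a small discrete HMM (log domain)."""
    n = len(pi)
    score = [math.log(pi[i]) + math.log(B[i][obs[0]]) for i in range(n)]
    back = []
    for o in obs[1:]:
        prev = score
        # Best predecessor for each destination state j.
        ptr = [max(range(n), key=lambda i: prev[i] + math.log(A[i][j]))
               for j in range(n)]
        score = [prev[ptr[j]] + math.log(A[ptr[j]][j]) + math.log(B[j][o])
                 for j in range(n)]
        back.append(ptr)
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


def likelihood(pi, A, B, obs):
    """Forward-algorithm likelihood of the observation sequence."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)


A = [[0.7, 0.3], [0.4, 0.6]]   # transition matrix
B = [[0.9, 0.1], [0.2, 0.8]]   # emission probabilities
obs = [0, 0, 1, 1, 0]

good_pi = [0.6, 0.4]           # properly normalized
bad_pi = [1.2, 0.8]            # same shape, but sums to 2 (unnormalized)

same_path = viterbi(good_pi, A, B, obs) == viterbi(bad_pi, A, B, obs)
ratio = likelihood(bad_pi, A, B, obs) / likelihood(good_pi, A, B, obs)
```

The constant factor shifts every path score by the same amount, so the argmax is unchanged, while the likelihood is scaled by that factor.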

alexbw commented 11 years ago

On it.

alexbw commented 11 years ago

The script is jefferson:/home/alexbw/Code/test_truncation.py. It loads a saved model from /scratch/broken_truncation_test.py. It imports pymouse, so add /home/alexbw/Code to your $PYTHONPATH.

mattjj commented 11 years ago

Perfect, I am EAGER to fix this.

Matt

On Aug 25, 2013, at 11:25 AM, Alex Wiltschko notifications@github.com wrote:

The script is jefferson:/home/alexbw/Code/test_truncation.py It draws in a saved model in /scratch/broken_truncation_test.py It imports pymouse, so add /home/alexbw/Code to your $PYTHONPATH

mattjj commented 11 years ago

Fixed in 16d985e (the cached left-censoring initial state distribution was not being flushed).
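
The kind of bug described here, a cached quantity surviving a change to the parameters it was computed from, can be sketched as follows. `ToyStates` and its attributes are hypothetical and much simpler than the real code; the sketch only shows why truncation has to flush caches.

```python
class ToyStates(object):
    """Hypothetical object with one cached quantity; NOT the pyhsmm class."""

    def __init__(self, weights):
        self.weights = list(weights)
        self._init_distn = None  # cache, filled lazily

    @property
    def init_distn(self):
        if self._init_distn is None:
            total = sum(self.weights)
            self._init_distn = [w / total for w in self.weights]
        return self._init_distn

    def clear_caches(self):
        self._init_distn = None

    def truncate(self, target_num, flush=True):
        self.weights = self.weights[:target_num]
        if flush:
            self.clear_caches()


stale = ToyStates([3.0, 1.0, 1.0])
stale.init_distn                    # populate the cache with 3 entries
stale.truncate(2, flush=False)      # the bug: cache is not flushed
stale_len = len(stale.init_distn)   # still 3, although only 2 states remain

fresh = ToyStates([3.0, 1.0, 1.0])
fresh.init_distn
fresh.truncate(2)                   # the fix: flush on truncation
fresh_distn = fresh.init_distn      # recomputed over the 2 remaining states
```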

alexbw commented 11 years ago

The truncate command works. Now I'm getting a traceback on the recommended call to Viterbi_EM_fit() afterwards:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-7-d83ab3b26678> in <module>()
      9     print i
     10     lhsmmModel_copy = deepcopy(lhsmmModel)
---> 11     lhsmmModel_copy.truncate(i, data, data_test)
     12     truncated_labels.append(lhsmmModel_copy.labels_)
     13     truncated_heldout_likelihoods.append(lhsmmModel_copy.heldout_sample_likelihoods_[0])

/Users/Alex/Code/pymouse/lhsmm.py in truncate(self, n_target_states, X, X_held_out)
    466         # Truncate the model
    467         self.hsmm_model.truncate_num_states(n_target_states, destructive=True)
--> 468         self.hsmm_model.Viterbi_EM_fit()
    469         print "TEST ONE TWO"
    470 

/Users/Alex/Code/pyhsmm_library_models/pyhsmm/models.pyc in Viterbi_EM_fit(self)
    275 
    276     def Viterbi_EM_fit(self):
--> 277         return self.MAP_EM_fit()
    278 
    279     def MAP_EM_step(self):

/Users/Alex/Code/pyhsmm_library_models/pyhsmm/basic/pybasicbayes/abstractions.pyc in MAP_EM_fit(self, tol, maxiter)
    202 
    203     def MAP_EM_fit(self,tol=1e-1,maxiter=100):
--> 204         return self._EM_fit(self.MAP_EM_step,tol=tol,maxiter=maxiter)
    205 
    206     @abc.abstractmethod

/Users/Alex/Code/pyhsmm_library_models/pyhsmm/basic/pybasicbayes/abstractions.pyc in _EM_fit(self, method, tol, maxiter)
    175         likes = []
    176         for itr in xrange(maxiter):
--> 177             method()
    178             likes.append(self.log_likelihood())
    179             if len(likes) > 1:

/Users/Alex/Code/pyhsmm_library_models/pyhsmm/models.pyc in MAP_EM_step(self)
    278 
    279     def MAP_EM_step(self):
--> 280         return self.Viterbi_EM_step()
    281 
    282     def Viterbi_EM_step(self):

/Users/Alex/Code/pyhsmm_library_models/library_models.pyc in Viterbi_EM_step(self)
    456 
    457     def Viterbi_EM_step(self):
--> 458         super(LibraryHSMMIntNegBinVariant,self).Viterbi_EM_step()
    459 
    460         # M step for duration distributions

/Users/Alex/Code/pyhsmm_library_models/library_models.pyc in Viterbi_EM_step(self)
    288 
    289         assert len(self.states_list) > 0, 'Must have data to run Viterbi EM'
--> 290         self.model._clear_caches()
    291 
    292         ## Viterbi step

AttributeError: 'LibraryHSMMIntNegBinVariant' object has no attribute 'model'

alexbw commented 11 years ago

I can close and open a separate issue if you'd like.

alexbw commented 11 years ago

Testing this a little further. I removed self.model._clear_caches() and replaced it with self._clear_caches(), because that's what's used elsewhere in library_models.py. Restarted all kernels and retrying...
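
A minimal reproduction of the failure mode in the traceback and of this one-line fix, using a hypothetical `ToyLibraryModel` rather than the real class: the method is defined on the object itself, so reaching for it through a nonexistent `model` attribute raises AttributeError.

```python
class ToyLibraryModel(object):
    """Hypothetical stand-in; NOT the real LibraryHSMMIntNegBinVariant."""

    def __init__(self):
        self._cache = {"likelihoods": "stale"}

    def _clear_caches(self):
        self._cache = {}

    def broken_step(self):
        # Mirrors the failing line: this object has no `model` attribute.
        self.model._clear_caches()

    def fixed_step(self):
        # Mirrors the fix: call the method on the object itself.
        self._clear_caches()


m = ToyLibraryModel()
try:
    m.broken_step()
    error = None
except AttributeError as exc:
    error = str(exc)

m.fixed_step()  # runs cleanly and empties the cache
```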

alexbw commented 11 years ago

Ok it's fixed, I'll push it

alexbw commented 11 years ago

I pushed to dev_dontwannamessitup (ab22579b753947c7f377bf8654fdcb3a18c3571e). @mattjj, I don't know if pushing to dev by myself is a good idea, given the shake-up of the submodules. If you give me the go-ahead, I'll merge it into dev; otherwise, I'll wait.

mattjj commented 11 years ago

Perfect fix! Thank you! You can pull into dev (or I can, or whatever). You can push into dev whenever you feel like it in general, though we could even have dev-matt and dev-alex.

alexbw commented 11 years ago

Pushed

mattjj commented 11 years ago

We want to add a random truncation mode (and maybe annealing-like truncation).

alexbw commented 11 years ago

Random truncation is in there, testing it now.
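
The thread does not describe how random truncation chooses states, so the following is only a guess at the idea: pick the states to keep uniformly at random instead of by how often they are used. `states_to_keep`, its `mode` argument, and the usage-based default are all hypothetical names and assumptions, not the library's API.

```python
import random
from collections import Counter


def states_to_keep(stateseq, target_num, mode="usage", seed=None):
    """Hypothetical helper: choose which state labels survive truncation."""
    counts = Counter(stateseq)
    if mode == "usage":
        # Assumption: the non-random mode keeps the most-used states.
        return sorted(s for s, _ in counts.most_common(target_num))
    if mode == "random":
        # Uniform choice among the states that actually appear.
        rng = random.Random(seed)
        return sorted(rng.sample(sorted(counts), target_num))
    raise ValueError("unknown mode: %r" % mode)


stateseq = [0, 0, 1, 2, 2, 2, 3, 1, 2, 0]
by_usage = states_to_keep(stateseq, 2)
by_random = states_to_keep(stateseq, 2, mode="random", seed=0)
```

Note that `random.Random.sample` raises ValueError if `target_num` exceeds the number of distinct states, which this sketch leaves unhandled.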

mattjj commented 11 years ago

This is all going well.

dattalab / pyhsmm-library-models

truncate HDP-based sampled model to fixed state size #24