Sampling after unfreezing fails

alexbw commented 11 years ago

I am unfreezing the model like so:

self.hsmm_model = deepcopy(self.hsmm_model.unfreeze())

The traceback I get is:

/Users/Alex/Code/pymouse/lhsmm.py in fit(self, X, y, X_held_out)
    461                 last_time = this_time
    462                 if self.parallel == False:
--> 463                     this_sample = self.hsmm_model.resample_and_copy()
    464                     if itr > self.n_iter_unfreeze - self.n_samples_to_save:
    465                         self.samples_.append(this_sample)

/Users/Alex/Code/pyhsmm_library_models/pyhsmm/basic/pybasicbayes/abstractions.pyc in resample_and_copy(self)
    185
    186     def resample_and_copy(self):
--> 187         self.resample_model()
    188         return self.copy_sample()
    189

/Users/Alex/Code/pyhsmm_library_models/pyhsmm/models.pyc in resample_model(self, **kwargs)
    471     def resample_model(self,**kwargs):
    472         self.resample_dur_distns()
--> 473         super(HSMM,self).resample_model(**kwargs)
    474
    475     def resample_dur_distns(self):

/Users/Alex/Code/pyhsmm_library_models/pyhsmm/models.pyc in resample_model(self, temp)
    155     def resample_model(self,temp=None):
    156         self._last_resample_used_temp = temp is not None and temp != 1
--> 157         self.resample_obs_distns()
    158         self.resample_trans_distn()
    159         self.resample_init_state_distn()

/Users/Alex/Code/pyhsmm_library_models/pyhsmm/models.pyc in resample_obs_distns(self)
    162     def resample_obs_distns(self):
    163         for state, distn in enumerate(self.obs_distns):
--> 164             distn.resample([s.data[s.stateseq == state] for s in self.states_list])
    165         self._clear_caches()
    166

/Users/Alex/Code/pyhsmm_library_models/pyhsmm/basic/pybasicbayes/models.pyc in resample(self, data, niter, temp)
    282
    283             for itr in range(niter):
--> 284                 self.resample_model(temp=temp)
    285
    286             self.labels_list.pop()

/Users/Alex/Code/pyhsmm_library_models/pyhsmm/basic/pybasicbayes/models.pyc in resample_model(self, temp)
     76     def resample_model(self,temp=None):
     77         assert all(isinstance(c,GibbsSampling) for c in self.components), \
---> 78                 'Components must implement GibbsSampling'
     79         for idx, c in enumerate(self.components):
     80             c.resample(data=[l.data[l.z == idx] for l in self.labels_list])

AssertionError: Components must implement GibbsSampling

mattjj commented 11 years ago

I wouldn't call deepcopy like that (though it shouldn't hurt, and can't possibly be causing this issue!). Just do hsmm_model.unfreeze() like here: https://github.com/dattalab/pyhsmm-library-models/blob/dev/correctness_tests/unfreeze.py#L83

Are you on the dev branch of pyhsmm-library-models? Can you try running correctness_tests/unfreeze.py? That one works on my laptop. (Don't forget git submodule update --recursive!)

alexbw commented 11 years ago

The correctness test works. I've tried both deepcopy() and no deepcopy(). That was just a "let's be super sure it's different" test. What other information can I gather?
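
One piece of information that might be worth gathering, offered as a guess rather than a diagnosis: an isinstance check can fail on an object that looks right if its class was loaded twice under different module paths, which leaves two distinct class objects with the same name. Nothing in this thread confirms that this is what happened here. The sketch below reproduces the effect with throwaway modules (the module names and the Gaussian class are hypothetical stand-ins) and includes a small helper that prints the module-qualified MRO of a component, which could be compared against the module of the GibbsSampling class used in the assertion.

```python
import types

# Hypothetical stand-in source; not the real pybasicbayes code.
SOURCE = """
class GibbsSampling(object):
    pass

class Gaussian(GibbsSampling):
    def resample(self, data=()):
        pass
"""


def load_copy(module_name):
    """Execute SOURCE as a fresh module, mimicking a second import path."""
    module = types.ModuleType(module_name)
    exec(SOURCE, module.__dict__)
    return module


first = load_copy("pybasicbayes_copy_a")   # hypothetical module names
second = load_copy("pybasicbayes_copy_b")

component = second.Gaussian()

# Same class name, different class objects: the first check fails.
print(isinstance(component, first.GibbsSampling))   # False
print(isinstance(component, second.GibbsSampling))  # True


def describe(obj):
    """Return module-qualified names for every class in obj's MRO."""
    return ["%s.%s" % (cls.__module__, cls.__name__)
            for cls in type(obj).__mro__]


print(describe(component))
```

Calling describe(c) on each entry of self.components just before the assertion would show which module each base class actually came from.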

mattjj commented 11 years ago

Hrmmmmm not sure what's up so that the correctness test works but the real stuff doesn't... I guess just delete the assertions!

alexbw commented 11 years ago

Appears to work without the assertion. Should I commit that, or leave it local?

mattjj commented 11 years ago

Push it!

mattjj commented 11 years ago

Assertions are often worthless. I put them in to try to express type flow, but it just can't be done in Python 2.
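
For what it's worth, a duck-typed check is one alternative to deleting the assertion outright. The following is only a sketch under assumptions: check_components and the two example classes are hypothetical names, not part of pybasicbayes. It tests for a callable resample attribute instead of a base class, so it does not depend on class identity.

```python
def check_components(components):
    """Raise TypeError if any component lacks a callable resample()."""
    missing = [type(c).__name__ for c in components
               if not callable(getattr(c, "resample", None))]
    if missing:
        raise TypeError(
            "Components lack a resample() method: %s" % ", ".join(missing))


class GoodComponent(object):  # hypothetical example class
    def resample(self, data=()):
        pass


class BadComponent(object):   # hypothetical example class
    pass


check_components([GoodComponent()])  # passes silently

try:
    check_components([GoodComponent(), BadComponent()])
except TypeError as err:
    print(err)
```

The trade-off is that this only checks that a method with the right name exists, not that it behaves like a Gibbs sampling step.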

alexbw commented 11 years ago

DUDE it takes a long time to run. This is not going to be feasible in practice, I don't think.

alexbw commented 11 years ago

If it's linear in state length, is it also linear in data length? That might be the hitch. What else could be causing this blow up? The last iteration was 20 seconds before unfreezing. This iteration post-unfreezing is going on ten minutes.

mattjj commented 11 years ago

Yes also linear in data. This is why I wrote all that special-purpose library model code with the likelihoods and the messages and shit!

We could profile to see the bottlenecks but I agree it sounds like it won't work.
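
A minimal sketch of what that profiling could look like, using only the standard library's cProfile and pstats. The profile_call helper is a hypothetical name, and the model in the usage note is assumed to be the unfrozen HSMM object from the traceback.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report).

    The report lists the ten entries with the largest cumulative time.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


# Stand-in workload so the sketch runs on its own.
total, report = profile_call(sum, range(1000))
print(report)
```

For the case in this thread, the call would presumably be something like profile_call(self.hsmm_model.resample_model), run for a single post-unfreezing iteration.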

mattjj commented 11 years ago

Btw I'm working on stochastic gradient mean field stuff precisely to avoid this wasteful inference-on-all-data-every-step thing, so another bit of silver lining is that my work is important!

We do have options to speed this up:

- Resample only a subset of the state sequences every time
- Use the parallel code I wrote to resample stateseqs and obs distns (it would be nice to put that to use, it was a lot of work!)

But given our current understanding of what the model is doing, I don't think this is worth our time at the moment.
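
To make the first option concrete, here is a rough sketch of resampling a random subset of state sequences per iteration. It assumes each entry of states_list exposes a resample() method; resample_subset and the FakeStates stub are hypothetical names used only for illustration, not existing pyhsmm functions.

```python
import random


def resample_subset(states_list, fraction, rng=random):
    """Resample a random fraction of the state sequences.

    Returns the sorted indices of the sequences that were resampled.
    """
    if not states_list:
        return []
    k = max(1, int(round(fraction * len(states_list))))
    chosen = rng.sample(range(len(states_list)), k)
    for idx in chosen:
        states_list[idx].resample()  # assumed per-sequence resample method
    return sorted(chosen)


class FakeStates(object):
    """Stub that counts resample() calls; stands in for a states object."""

    def __init__(self):
        self.count = 0

    def resample(self):
        self.count += 1


sequences = [FakeStates() for _ in range(10)]
picked = resample_subset(sequences, 0.3, rng=random.Random(0))
print(len(picked), sum(s.count for s in sequences))
```

Per-iteration cost for the state sequences then scales with the chosen fraction, at the price of each sequence being updated less often.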

alexbw commented 11 years ago

New problem.

/Users/Alex/Code/pyhsmm_library_models/pyhsmm/basic/pybasicbayes/abstractions.pyc in resample_and_copy(self)
    186     def resample_and_copy(self):
    187         self.resample_model()
--> 188         return self.copy_sample()
    189
    190 class ModelMeanField(Model):

/Users/Alex/Code/pyhsmm_library_models/pyhsmm/models.pyc in copy_sample(self)
    485
    486     def copy_sample(self):
--> 487         new = super(HSMM,self).copy_sample()
    488         new.dur_distns = [d.copy_sample() for d in self.dur_distns]
    489         return new

/Users/Alex/Code/pyhsmm_library_models/pyhsmm/models.pyc in copy_sample(self)
    181         new.obs_distns = [o.copy_sample() for o in self.obs_distns]
    182         new.trans_distn = self.trans_distn.copy_sample()
--> 183         new.init_state_distn = self.init_state_distn.copy_sample()
    184         new.states_list = [s.copy_sample(new) for s in self.states_list]
    185         return new

alexbw commented 11 years ago

Do we care about unfreezing still? It seems like that would have been a temporary stopgap to get around model shortcomings that we've more concretely identified since this issue was opened and subsequently stalled.

dattalab / pyhsmm-library-models

Sampling after unfreezing fails #28