Closed alexbw closed 10 years ago
Wow, that's inside some weave'd code (hence the python27_compiled directory). I don't see anything in this trace that can tell us in what weave'd function the segfault happened.
It crashes before entering the resampling loop? Any more info on where exactly it might be dying? I can make some guesses but it'd be good to narrow it down.
(Maybe a stack array gets too big...)
Found the line. It's in sample_markov() in stats.py. My gut says there's an off-by-one here, but I will have to experiment.
(gdb) l
718 i++
719 ) ;
720 out[0] = i;
721
722 for (int t=1; t<T; t++) {
723 for (
724 i=0,val=((float)rand())/RAND_MAX;
725 i < N-1 && (val -= trans_matrix[N*out[t-1]+i]) > 1e-6;
726 i++
727 ) ;
Did you ever think you'd see the day when your transition matrices would be so big that you couldn't index them with 32-bit integers? Today's the day.
That was my guess for which method it would be.
That matrix is more than 4GB? Wow... I need to write a special forward sampling function for that... (in addition to the backwards posterior sampling function I wrote with sparse matrices that now needs faster sparse matrices).
Err, 4GB would be the limit for indexing with an unsigned int; with a signed 32-bit int I guess the limit is about 2G elements right now.
For the sake of the log, the trans_matrix is 2766760000 elements, 10.3GB. Big.
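The arithmetic behind those numbers can be checked with a short sketch. It assumes the matrix is square (52600 x 52600, which matches the element count above) and stored as 4-byte floats (which matches the 10.3GB figure); neither is stated explicitly in the thread. `wrap_int32` is a hypothetical helper that mimics the usual two's-complement wraparound; signed overflow in C is formally undefined behavior, so the real code may misbehave differently.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(x):
    """Wrap a Python int the way typical two's-complement 32-bit arithmetic would."""
    return (x + 2**31) % 2**32 - 2**31


n_elements = 2766760000   # element count quoted in the thread
N = 52600                 # assumption: square matrix, since 52600**2 == n_elements
assert N * N == n_elements

# Assumption: 4-byte (float32) entries, which reproduces the quoted size.
gib = n_elements * 4 / float(2**30)

# Largest flat index the sampling loop can form: N*out[t-1] + i.
last_index = N * (N - 1) + (N - 1)

# First previous-state value whose row offset N*row no longer fits in int32.
first_overflow_row = INT32_MAX // N + 1

print("size in GiB: %.1f" % gib)
print("last flat index fits in int32: %s" % (last_index <= INT32_MAX))
print("last flat index after wraparound: %d" % wrap_int32(last_index))
print("row offsets overflow from row %d onward" % first_overflow_row)
```

Under these assumptions, any sampled state at or beyond `first_overflow_row` produces a flat index that a 32-bit `int` cannot hold, which is consistent with a crash inside the indexing expression rather than an off-by-one.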
I'm on the new dev branch, with sparse matrices that will fit into memory. I'm getting the following error when running the parallel-library-subhmm.py test. library-subhmm.py works, though.
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<string> in <module>()
/home/abw11/Code/pyhsmm_library_models/pyhsmm/parallel.py in _call(f, data_id, **kwargs)
86 @engine_global_namespace
87 def _call(f,data_id,**kwargs):
---> 88 return f(my_data[data_id],**kwargs)
89
90 if engine_globals is not None:
/home/abw11/Code/pyhsmm_library_models/library_subhmm_models.py in _state_sampler(frozen_aBl, **kwargs)
117 data=frozen_aBl, # dummy
118 frozen_aBl=frozen_aBl,
--> 119 initialize_from_prior=False,temp=temp,**kwargs)
120 like = global_model.states_list[-1].log_likelihood()
121 big_stateseq = global_model.states_list.pop().big_stateseq
/home/abw11/Code/pyhsmm_library_models/pyhsmm/models.pyc in add_data(self, data, stateseq, trunc, right_censoring, left_censoring, **kwargs)
457 left_censoring=left_censoring,
458 trunc=trunc,
--> 459 **kwargs))
460
461 ### generation
/home/abw11/Code/pyhsmm_library_models/library_subhmm_models.pyc in __init__(self, model, data, frozen_aBl, **kwargs)
37 self._frozen_aBls = [frozen_aBl] * self.hsmm_trans_matrix.shape[0]
38 super(HSMMIntNegBinVariantFrozenSubHMMsStates,self).__init__(
---> 39 model=model,data=data,**kwargs)
40
41 # TODO compute likelihoods lazily? push this into aBls? why'd I break it
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in __init__(self, model, *args, **kwargs)
1119 def __init__(self,model,*args,**kwargs):
1120 self.model = model
-> 1121 super(HSMMIntNegBinVariantSubHMMsStates,self).__init__(model,*args,**kwargs)
1122 self.data = self.data.astype('float32',copy=False) if self.data is not None else None
1123 self._alphan = None
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in __init__(self, *args, **kwargs)
735
736 def __init__(self,*args,**kwargs):
--> 737 HSMMStatesPython.__init__(self,*args,**kwargs)
738
739 def clear_caches(self):
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in __init__(self, model, right_censoring, left_censoring, trunc, stateseq, **kwargs)
449 self.trunc = trunc
450
--> 451 super(HSMMStatesPython,self).__init__(model,stateseq=stateseq,**kwargs)
452
453 def _get_stateseq(self):
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in __init__(self, model, T, data, stateseq, initialize_from_prior, **kwargs)
31 else:
32 if data is not None and not initialize_from_prior:
---> 33 self.resample(**kwargs)
34 else:
35 self.generate_states()
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in resample(self, temp)
1343 def resample(self,temp=None):
1344 # TODO something with temperature
-> 1345 self._remove_substates_from_subHMMs()
1346 alphan = self.messages_forwards_normalized()
1347 self.sample_backwards_normalized(alphan)
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in _remove_substates_from_subHMMs(self)
1355
1356 def _remove_substates_from_subHMMs(self):
-> 1357 for superstate, states_obj in zip(self.stateseq_norep, self.substates_list):
1358 self.model.HMMs[superstate].states_list.remove(states_obj)
1359 self.substates_list = []
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in stateseq_norep(self)
463 @property
464 def stateseq_norep(self):
--> 465 if self._stateseq_norep is None:
466 self._stateseq_norep, self._durations_censored = rle(self.stateseq)
467 return self._stateseq_norep
AttributeError: 'HSMMIntNegBinVariantFrozenSubHMMsStates' object has no attribute '_stateseq_norep'
Btw you can just add a hasattr check there if you want. It should be set to None on object initialization, but that bit must have been lost in one of my changes. Checking hasattr is an easy local fix.
So to the start of the condition add
not hasattr(self,'_stateseq_norep') or ...
Matt
On Nov 6, 2013, at 10:01 AM, Alex Wiltschko notifications@github.com wrote:
[Quoted copy of the message and traceback above omitted.]
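A minimal sketch of the `hasattr` guard suggested in this reply. The class `States` and the `rle` helper here are hypothetical stand-ins written for illustration, not pyhsmm's own implementations; only the shape of the condition is taken from the thread.

```python
from itertools import groupby


def rle(seq):
    """Run-length encode seq into (values, run_lengths). Stand-in helper."""
    values, lengths = [], []
    for value, run in groupby(seq):
        values.append(value)
        lengths.append(len(list(run)))
    return values, lengths


class States(object):
    """Hypothetical class that forgets to initialize its cache attribute."""

    def __init__(self, stateseq):
        self.stateseq = stateseq
        # Note: self._stateseq_norep is deliberately NOT set here,
        # mimicking the missing initialization described above.

    @property
    def stateseq_norep(self):
        # The hasattr check makes the lazy cache robust to the missing attribute.
        if not hasattr(self, '_stateseq_norep') or self._stateseq_norep is None:
            self._stateseq_norep, self._durations_censored = rle(self.stateseq)
        return self._stateseq_norep


states = States([0, 0, 1, 1, 1, 2])
print(states.stateseq_norep)
```

The guard is a local patch: it hides the missing initialization for this one attribute, but any other attribute that was supposed to be set at construction time can still raise the same kind of error.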
One step further. Should _stateseq and _stateseq_norep be set to None ahead of time?
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<string> in <module>()
/home/abw11/Code/pyhsmm_library_models/pyhsmm/parallel.py in _call(f, data_id, **kwargs)
86 @engine_global_namespace
87 def _call(f,data_id,**kwargs):
---> 88 return f(my_data[data_id],**kwargs)
89
90 if engine_globals is not None:
/home/abw11/Code/pyhsmm_library_models/library_subhmm_models.py in _state_sampler(frozen_aBl, **kwargs)
117 data=frozen_aBl, # dummy
118 frozen_aBl=frozen_aBl,
--> 119 initialize_from_prior=False,temp=temp,**kwargs)
120 like = global_model.states_list[-1].log_likelihood()
121 big_stateseq = global_model.states_list.pop().big_stateseq
/home/abw11/Code/pyhsmm_library_models/pyhsmm/models.pyc in add_data(self, data, stateseq, trunc, right_censoring, left_censoring, **kwargs)
457 left_censoring=left_censoring,
458 trunc=trunc,
--> 459 **kwargs))
460
461 ### generation
/home/abw11/Code/pyhsmm_library_models/library_subhmm_models.pyc in __init__(self, model, data, frozen_aBl, **kwargs)
37 self._frozen_aBls = [frozen_aBl] * self.hsmm_trans_matrix.shape[0]
38 super(HSMMIntNegBinVariantFrozenSubHMMsStates,self).__init__(
---> 39 model=model,data=data,**kwargs)
40
41 # TODO compute likelihoods lazily? push this into aBls? why'd I break it
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in __init__(self, model, *args, **kwargs)
1119 def __init__(self,model,*args,**kwargs):
1120 self.model = model
-> 1121 super(HSMMIntNegBinVariantSubHMMsStates,self).__init__(model,*args,**kwargs)
1122 self.data = self.data.astype('float32',copy=False) if self.data is not None else None
1123 self._alphan = None
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in __init__(self, *args, **kwargs)
735
736 def __init__(self,*args,**kwargs):
--> 737 HSMMStatesPython.__init__(self,*args,**kwargs)
738
739 def clear_caches(self):
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in __init__(self, model, right_censoring, left_censoring, trunc, stateseq, **kwargs)
449 self.trunc = trunc
450
--> 451 super(HSMMStatesPython,self).__init__(model,stateseq=stateseq,**kwargs)
452
453 def _get_stateseq(self):
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in __init__(self, model, T, data, stateseq, initialize_from_prior, **kwargs)
31 else:
32 if data is not None and not initialize_from_prior:
---> 33 self.resample(**kwargs)
34 else:
35 self.generate_states()
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in resample(self, temp)
1343 def resample(self,temp=None):
1344 # TODO something with temperature
-> 1345 self._remove_substates_from_subHMMs()
1346 alphan = self.messages_forwards_normalized()
1347 self.sample_backwards_normalized(alphan)
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in _remove_substates_from_subHMMs(self)
1355
1356 def _remove_substates_from_subHMMs(self):
-> 1357 for superstate, states_obj in zip(self.stateseq_norep, self.substates_list):
1358 self.model.HMMs[superstate].states_list.remove(states_obj)
1359 self.substates_list = []
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in stateseq_norep(self)
464 def stateseq_norep(self):
465 if not hasattr(self,"_stateseq_norep") or self._stateseq_norep is None:
--> 466 self._stateseq_norep, self._durations_censored = rle(self.stateseq)
467 return self._stateseq_norep
468
/home/abw11/Code/pyhsmm_library_models/pyhsmm/internals/states.pyc in _get_stateseq(self)
452
453 def _get_stateseq(self):
--> 454 return self._stateseq
455
456 def _set_stateseq(self,stateseq):
AttributeError: 'HSMMIntNegBinVariantFrozenSubHMMsStates' object has no attribute '_stateseq'
Yes. That logic was all working at some point; not sure what happened to it.
Matt
On Nov 6, 2013, at 12:08 PM, Alex Wiltschko notifications@github.com wrote:
[Quoted copy of the message and traceback above omitted.]
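One way to read "set to None ahead of time" is sketched below: declare every lazily filled attribute at the top of `__init__`, before any code path can call `resample()`. The class `StatesBase`, its `_remove_substates` method, and the placeholder sampling step are all hypothetical; this is not the pyhsmm code, and the early-return guard is an added assumption about how an unsampled object should behave.

```python
class StatesBase(object):
    """Hypothetical states object whose constructor may call resample()."""

    def __init__(self, data=None, stateseq=None):
        # Declare cache attributes first so no later method hits AttributeError.
        self._stateseq = None
        self._stateseq_norep = None
        self.substates_list = []
        self.data = data

        if stateseq is not None:
            self.stateseq = stateseq
        elif data is not None:
            self.resample()

    @property
    def stateseq(self):
        return self._stateseq

    @stateseq.setter
    def stateseq(self, value):
        self._stateseq = value
        self._stateseq_norep = None  # invalidate the derived cache

    def _remove_substates(self):
        # Assumption: with no state sequence sampled yet there is nothing to remove.
        if self._stateseq is None:
            return
        self.substates_list = []

    def resample(self):
        self._remove_substates()
        # Placeholder for the real sampling step: assign state 0 everywhere.
        self.stateseq = [0] * len(self.data)


s = StatesBase(data=[0.1, 0.2, 0.3])
print(s.stateseq)
```

Initializing to `None` alone only changes which error appears: something still has to handle the "no state sequence yet" case, which is what the early return does in this sketch.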
This should have been a separate issue! I am going to open a separate issue so I can have the satisfaction of closing two issues.
This issue should be closed. I'm still using 32-bit indices, and it might be a good idea to typedef those to a 64-bit type, but since I no longer instantiate the fatty transition matrix or linearly index into it, these segfaults are fixed.
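A rough sketch of forward sampling that never forms a flat `N*row + i` index, because each transition row is fetched on demand. The function signature and the `get_row` callback are hypothetical and chosen for illustration; this is not the implementation on the dev branch.

```python
import random


def sample_markov(T, get_row, init_dist, rng=random):
    """Sample T >= 1 states; get_row(i) returns the transition row out of state i."""

    def draw(probs):
        # Inverse-CDF draw: subtract probabilities until the uniform variate is used up.
        val = rng.random()
        for i, p in enumerate(probs):
            val -= p
            if val < 0:
                return i
        return len(probs) - 1  # guard against floating-point round-off

    out = [draw(init_dist)]
    for t in range(1, T):
        out.append(draw(get_row(out[-1])))
    return out


def cyclic_row(i, n=3):
    """Toy transition row: state i always moves to state (i + 1) % n."""
    row = [0.0] * n
    row[(i + 1) % n] = 1.0
    return row


print(sample_markov(5, cyclic_row, [1.0, 0.0, 0.0]))
```

Because only one row is touched per step, `get_row` can be backed by a sparse structure or computed lazily, and no index ever needs to exceed N.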
I keep bumping into a segmentation fault on Orchestra, SEAS and Jefferson.
It occurs all of a sudden: memory does not creep up, and the program crashes before resampling.
I have a stack trace, but I have no experience reading these things, so I have no idea if it's useful. I'm going to keep digging with print statements to find the source of the crash.