mattjj / pyhsmm

MIT License
547 stars 173 forks

Parallel resampling of observation distributions #9

Closed inti closed 10 years ago

inti commented 11 years ago

Hi Matt, I have created a resample_model_parallel2 function which resamples the observation distributions in parallel. This comes at the cost of pushing the model into the engines an additional time, but in my test this added only very minor overhead. I have added the code to parallel.py; that was easy, given the code you had in there already!

In my initial test with Nmax = 100 the execution time came down to half :)

I did not see a way to parallelise the sampling of the trans_distn, but if you think it is possible let me know and I can work on it. Resampling of the initial state is super quick anyway, so there is no point in speeding that up.
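For reference, the rough pattern is sketched below. This is not the actual resample_model_parallel2 code: the data_for_state helper is hypothetical, model stands for the already-constructed pyhsmm model, and it assumes an ipcluster is running with the data already pushed to the engines.

from IPython.parallel import Client

client = Client()                      # assumes an ipcluster is already running
lbv = client.load_balanced_view()

def resample_obs_distn(arg):
    # each engine resamples one observation distribution given the data
    # currently assigned to its state; distn.resample(data) is the usual
    # pyhsmm/pybasicbayes distribution interface
    distn, data = arg
    distn.resample(data)
    return distn

# data_for_state is a hypothetical helper that gathers the observations
# currently assigned to state index idx across all loaded sequences
jobs = [(distn, data_for_state(model, idx))
        for idx, distn in enumerate(model.obs_distns)]
model.obs_distns = lbv.map_sync(resample_obs_distn, jobs)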

Thanks a lot!

Inti

inti commented 11 years ago

A bit more on timings: on a larger piece of data with 11 data sequences, the resample_model_parallel2 function is about 10-15% faster than the original parallel function, but both parallel functions are about 5 times slower than the non-parallel version ...

I am looking more into this.

mattjj commented 11 years ago

I like your pull request! Should I merge it now, or wait for these general timing issues to be tracked down?

That line shouldn't be reloading the data; it should just be building up new states objects given the state sequences returned and the data that is already loaded locally. It should just be copying pointers.

Is that line definitely slow? If it is then I'm wrong about something… I suppose instead of building a new object we could re-use the old one and just set its stateseq pointer; that might avoid some object-creation overhead, though it would be good to understand if the object creation is really the problem.
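Something like the following would express that alternative (a sketch only; the data_id_to_index lookup is a hypothetical helper, and the rest mirrors the _build_states_parallel code quoted below):

def _build_states_parallel(self, states_to_resample):
    from pyhsmm import parallel
    raw_stateseq_tuples = parallel.build_states.map(
            [s.data_id for s in states_to_resample])
    for data_id, stateseq in raw_stateseq_tuples:
        # reuse the existing states object and just swap in the new state
        # sequence, instead of rebuilding it via add_data()
        idx = data_id_to_index[data_id]      # hypothetical data_id -> position lookup
        self.states_list[idx].stateseq = stateseq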

On Apr 17, 2013, at 8:36 AM, Inti Pedroso notifications@github.com wrote:

I think I may have found the issue.

def _build_states_parallel(self,states_to_resample):
    from pyhsmm import parallel
    raw_stateseq_tuples = parallel.build_states.map([s.data_id for s in states_to_resample])
    for data_id, stateseq in raw_stateseq_tuples:
        self.add_data(data=parallel.alldata[data_id],stateseq=stateseq)
        self.states_list[-1].data_id = data_id

Is it necessary to reload the data each time the state sequence is updated? I would have thought it is enough with

self.states_list[data_id].stateseq = stateseq

I think this is making this slower. Inti


inti commented 11 years ago

Hi,

self.add_data(data=parallel.alldata[data_id],stateseq=stateseq)

is not really the problem, I think. Pushing the data into the 8 local engines takes 0.02 seconds, so that is not the issue either.

I still wonder why the parallel code, either the original or the one above, is slower. If I resample the obs_distns in parallel and the states locally, it takes about 50% of the time of the non-parallel version and ~1/3 of the time of the parallel ones. I have already increased the amount of data and it does not change the picture much.

I also tried with

def rss(s):
    s.resample()
    return s
parallel.dv.map_sync(rss, [ s for s in negative_binomial_model.states_list ])

but I am getting the following error

[0:apply]: 
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)<string> in <module>()
<ipython-input-2-37caae605d17> in rss(s)
/home/inti/APP/pyhsmm/internals/states.pyc in resample(self)
    103 
    104     def resample(self):
--> 105         betal = self.messages_backwards()
    106         self.sample_forwards(betal)
    107 
/home/inti/APP/pyhsmm/internals/states.pyc in messages_backwards(self)
     81 
     82     def messages_backwards(self):
---> 83         return self._messages_backwards(self.model.trans_distn.A,self.aBl)
     84 
     85     @staticmethod
/home/inti/APP/pyhsmm/internals/states.pyc in _messages_backwards(trans_matrix, log_likelihoods)
    169         betal = np.zeros((T,M))
    170 
--> 171         scipy.weave.inline(hmm_messages_backwards_codestr,['AT','betal','aBl','T','M'],
    172                 headers=['<Eigen/Core>'],include_dirs=[eigen_path],
    173                 extra_compile_args=['-O3','-DNDEBUG'])
NameError: global name 'hmm_messages_backwards_codestr' is not defined

Do you have any idea why the parallel implementations are taking longer or where the bottleneck could be? I think it is perhaps better not to merge yet, just in case there is another way of making this better. All the best. Inti

inti commented 11 years ago

The above error seems to be due to namespace issues with IPython parallel. It is not solved by adding

@lbv.parallel(block=True)
@interactive
def rss(s):
    global hmm_messages_backwards_codestr, eigen_path
    s.resample()
    return s

to parallel.py. Inti

inti commented 11 years ago

Defining the function like

@lbv.parallel(block=True)
@interactive
def rss(s):
    global global_model
    global_model.states_list[0].resample()
    #s.resample()
    return global_model.states_list[0]

(using index 0 just for testing)

also spits out the same error

[3:apply]: 
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)<string> in <module>()
/home/inti/My_Soft/pyhsmm/parallel.py in rss(s)
     56 def rss(s):
     57     global global_model
---> 58     global_model.states_list[0].resample()
     59     #s.resample()
     60     return global_model.states_list[0]
/home/inti/My_Soft/pyhsmm/internals/states.pyc in resample(self)
    103 
    104     def resample(self):
--> 105         betal = self.messages_backwards()
    106         self.sample_forwards(betal)
    107 
/home/inti/My_Soft/pyhsmm/internals/states.pyc in messages_backwards(self)
     81 
     82     def messages_backwards(self):
---> 83         return self._messages_backwards(self.model.trans_distn.A,self.aBl)
     84 
     85     @staticmethod
/home/inti/My_Soft/pyhsmm/internals/states.pyc in _messages_backwards(trans_matrix, log_likelihoods)
    169         betal = np.zeros((T,M))
    170 
--> 171         scipy.weave.inline(hmm_messages_backwards_codestr,['AT','betal','aBl','T','M'],
    172                 headers=['<Eigen/Core>'],include_dirs=[eigen_path],
    173                 extra_compile_args=['-O3','-DNDEBUG'])
NameError: global name 'hmm_messages_backwards_codestr' is not defined

I am not sure whether this strategy would be faster or not; is your expectation that it will not be?

I am trying to approach the problem in different ways ... I just cannot see why splitting the states resampling across engines is slower than doing it sequentially on a single engine.

Inti

mattjj commented 11 years ago

I think this might be a bug I introduced in the rewrite on Saturday. Basically, the string is no longer copied into the states instances, and therefore not pushed along when the global model is distributed.

I think I know a quick fix. Hold on and I'll put it into mattjj/master...


mattjj commented 11 years ago

Hmm, actually there's another issue to worry about: the eigen_path variable for the engine processes needs to be set correctly (I don't remember how I did it before or whether it was even done right; maybe my tests were just getting lucky). I have some stuff to do this morning so I can't fix it right away; maybe I can fix it this afternoon or tonight. Sorry!

For the time being you might be able to work on a version from before Saturday afternoon.
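For reference, a generic IPython.parallel workaround (not necessarily the fix described above) is to push the missing module-level names to every engine explicitly. A sketch, assuming both names are still importable from pyhsmm.internals.states on the client, which the Saturday rewrite may have broken:

from IPython.parallel import Client
from pyhsmm.internals import states as states_internals

client = Client()
dv = client[:]
# push the module-level names the weave call expects into every engine's namespace
dv.push({'hmm_messages_backwards_codestr': states_internals.hmm_messages_backwards_codestr,
         'eigen_path': states_internals.eigen_path})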


inti commented 11 years ago

Thanks for looking into this. I have reverted to the Saturday morning code. Cheers. Inti

mattjj commented 11 years ago

I think 0344fa0, which is the new master branch head, might fix the issue, but I haven't had a chance to test it. Let me know if you can give it a shot.

I got rid of the use_eigen stuff, so you should remove those calls from your script files. To use the Eigen-backed classes, just instantiate a models.HMMEigen object (the same way you used to instantiate a models.HMM object). The examples/hmm.py file should make it more clear. That stuff also isn't very well tested at the moment, so let me know if anything blows up.
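For example, something along these lines (the Gaussian observation distribution, its hyperparameters, and the alpha/gamma constructor arguments below are illustrative assumptions; examples/hmm.py shows the exact signature):

import numpy as np
import pyhsmm
from pyhsmm.basic import distributions

Nmax = 25
obs_distns = [distributions.Gaussian(mu_0=np.zeros(2), sigma_0=np.eye(2),
                                     kappa_0=0.25, nu_0=4.) for _ in range(Nmax)]

# previously this would have been something like
#   pyhsmm.use_eigen()
#   model = pyhsmm.models.HMM(alpha=6., gamma=6., obs_distns=obs_distns)
# now just instantiate the Eigen-backed class directly
model = pyhsmm.models.HMMEigen(alpha=6., gamma=6., obs_distns=obs_distns)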

inti commented 11 years ago

Cool! I'll try it and let you know. Thanks a lot. Inti


inti commented 11 years ago

Hi, I got this error when I tried to run something with the latest version

Traceback (most recent call last):
  File "cnv_calling.py", line 11, in <module>
    import pyhsmm
  File "/home/inti/My_Soft/pyhsmm/__init__.py", line 2, in <module>
    import models
  File "/home/inti/My_Soft/pyhsmm/models.py", line 8, in <module>
    import basic.distributions
  File "/home/inti/My_Soft/pyhsmm/basic/distributions.py", line 59, in <module>
    NegativeBinomialFixedRDuration = _start_at_one(NegativeBinomialFixedR)
NameError: name 'NegativeBinomialFixedR' is not defined

Let me know what you think.

BW, Inti. PS: have a nice weekend!

mattjj commented 11 years ago

Try running git submodule update --recursive
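(If the submodule was newly added and has never been initialised locally, git submodule update --init --recursive may be needed instead.)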

inti commented 11 years ago

That did the trick. I'll carry on testing it ... thanks!


inti commented 11 years ago

I'll finish testing this on Monday and let you know about more timing experiments. BW, and have a nice weekend. Inti


inti commented 11 years ago

Hi, I have now tested this (not timed it thoroughly though) and it seems to work fine. It would be great if you could merge now. In my quick timing it was a bit faster than using a single CPU, but it seems to get better the more sequences one has, so I will test more to try to work out when it makes sense to use this approach. Thanks a lot!

Inti

inti commented 11 years ago

Hang on, I found a minor problem.

inti commented 11 years ago

Hi, I had a minor bug but it is OK now. Let me know what you think.

BW, Inti

mattjj commented 11 years ago

I just re-did the parallelism stuff so that this kind of thing should be easier and more robust. Are you still working on this?

inti commented 11 years ago

Hi, yes, I am still working on this, but it has somehow gone down the priority list; I should get back to it in a few weeks' time.

mattjj commented 10 years ago

The code has changed a lot since these issues were active, so I'm closing them.