mattjj / pyhsmm

MIT License

Multiple chains #11

Open inti opened 11 years ago

inti commented 11 years ago

Hi, would it make sense to have some sort of multi-chain class that wraps multiple models (potentially running in parallel) and checks convergence of the parameter estimates? I think it would be useful to be able to write something like

model.resample_model(chains = 4, check_convergence = 100)
model.summary_parameters()

to run 4 chains, check convergence every 100 iterations, and then summarize the parameter estimates using the samples from all chains.

Do you have some sort of code that does this, even if rudimentary? I can try to wrap it up.

BW, Inti

mattjj commented 11 years ago

That is a great idea. Unfortunately, I do not have any code for that! I have some thoughts below, though.

You can do things "by hand". Since the "model" object contains the state of a sampler chain, it would make sense to create multiple model objects (a copy for each chain) and just call resample_model on each separately. You could use copy.deepcopy to construct copies of the model easily, but that would copy the data as well (minor overhead).
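A minimal sketch of the "by hand" approach, assuming only that a model exposes `resample_model()` and holds its own chain state. `ToyModel` and `run_chains` are hypothetical stand-ins written for illustration, not part of pyhsmm; a real pyhsmm model would take the place of `ToyModel`.

```python
import copy
import random


class ToyModel:
    """Hypothetical stand-in for a model object: holds data plus one chain's state."""

    def __init__(self, data, seed=0):
        self.data = list(data)
        self.rng = random.Random(seed)
        self.mu = 0.0  # current sample of the single parameter

    def resample_model(self):
        # One Gibbs step for the mean of unit-variance Gaussian data under a
        # flat prior: mu | data ~ N(mean(data), 1 / n).
        n = len(self.data)
        data_mean = sum(self.data) / n
        self.mu = self.rng.gauss(data_mean, (1.0 / n) ** 0.5)


def run_chains(model, num_chains=4, num_iter=100):
    """Copy the model once per chain and resample each copy independently."""
    # deepcopy duplicates the data too, as noted above.
    chains = [copy.deepcopy(model) for _ in range(num_chains)]
    for i, chain in enumerate(chains):
        # deepcopy also duplicates the toy model's RNG state, so each copy is
        # reseeded here; otherwise all chains would produce identical samples.
        chain.rng.seed(i)
    samples = [[] for _ in chains]
    for _ in range(num_iter):
        for chain, trace in zip(chains, samples):
            chain.resample_model()
            trace.append(chain.mu)
    return chains, samples
```

The chains here run sequentially in one process. Because each copy is independent, the inner loop could in principle be handed to separate worker processes, but that is left out of this sketch.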

As far as I know there's no single good way to check convergence, so you'd probably want to implement a heuristic that works for you outside of the model instance as well. The same thing might apply to summarizing parameters, though one might also add some kind of self-summarizing functionality to distribution objects. In fact, distribution objects may be able to implement their own convergence testing functionality that a model could use...
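As one example of such a heuristic living outside the model, the sketch below computes the Gelman-Rubin potential scale reduction factor for a single scalar parameter from per-chain traces. This is a commonly used diagnostic chosen here for illustration, not something the thread or pyhsmm prescribes; the function name `gelman_rubin` and the 1.1 threshold mentioned afterwards are assumptions.

```python
from statistics import mean, variance


def gelman_rubin(traces):
    """Potential scale reduction factor for one scalar parameter.

    `traces` is a list of equal-length lists, one list of samples per chain.
    """
    m = len(traces)
    n = len(traces[0])
    if m < 2 or n < 2 or any(len(t) != n for t in traces):
        raise ValueError("need at least 2 chains of equal length >= 2")

    chain_means = [mean(t) for t in traces]
    # W: average within-chain sample variance.
    within = mean([variance(t) for t in traces])
    if within == 0:
        raise ValueError("within-chain variance is zero")
    # B: between-chain variance, scaled by the chain length.
    between = n * variance(chain_means)

    # Pooled estimate of the posterior variance, then the ratio to W.
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5
```

Values close to 1 suggest the chains agree with each other; a common rule of thumb treats values above roughly 1.1 as a sign that the chains have not mixed. Agreement between chains does not prove convergence, so this should be read as a heuristic only.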

Basically, my thought is that this kind of functionality would be great, and it should sit "outside" the model objects (though this demonstrates a way in which the "model" name isn't a perfect fit).