mattjj / pyhsmm

MIT License

:question: extending pyhsmm observations #53

Closed zouhairm closed 8 years ago

zouhairm commented 8 years ago

:question: (not sure how to assign label, I think a project developer needs to)

Hi,

I am using pyhsmm for a research project (thanks for making this public, it's awesome!) and I want to use a slightly different observation distribution, or rather a different prior on the observation distribution's covariance matrices: a scaled inverse Wishart, or a separation of the covariance into correlation and variance matrices (see http://www.themattsimpson.com/2012/08/20/prior-distributions-for-covariance-matrices-the-scaled-inverse-wishart-prior/).

My question is: can this be done just by extending the Gaussian class in pybasicbayes (https://github.com/mattjj/pybasicbayes/blob/master/pybasicbayes/distributions/gaussian.py), or do I have to change things inside the Gibbs sampler as well?

If extending the Gaussian class is enough, which functions need to be implemented or changed? Would mirroring IsotropicGaussian be enough, despite its '# TODO collapsed, meanfield, max_likelihood' comment?

Happy to make a pull request back into the project if I get it to work :)

mattjj commented 8 years ago

I haven't had a chance yet to read that pdf you linked (it looks like I would learn a lot from it!), but I'll try to answer your question as best I can without reading it.

You should probably extend a Gaussian class in pybasicbayes, if only just to organize the code in the way pyhsmm expects (and in a way that would be easy to make a PR to pybasicbayes!). I'm not sure if much of the machinery in the existing classes will be useful for your purposes, but by extending _GaussianBase you'd at least get plotting, log likelihood, and random variable generation.

As for what you'd need to implement, check out the abstractions.py file in pybasicbayes. The GibbsSampling class shows that you only need to implement a resample method on top of the requirements of its parent Distribution, which are just log_likelihood and rvs (which come in _GaussianBase). In general, the methods you need to implement are marked as abstractmethod.
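To make the shape of that interface concrete, here is a minimal sketch. The `Distribution` and `GibbsSampling` classes below are stand-ins written for this example (they are not imported from pybasicbayes, and the real signatures may differ), and `ScalarGaussianKnownVar` is a hypothetical toy class: a scalar Gaussian with fixed variance and a conjugate normal prior on the mean. A scaled inverse Wishart prior would need a different, non-trivial `resample`, but it would plug in at the same place.

```python
import abc
import math
import random


class Distribution(abc.ABC):
    """Stand-in for the base interface: evaluate densities and draw samples."""

    @abc.abstractmethod
    def log_likelihood(self, x):
        ...

    @abc.abstractmethod
    def rvs(self, size=1):
        ...


class GibbsSampling(Distribution):
    """Stand-in: adds the single method a Gibbs sampler needs to call."""

    @abc.abstractmethod
    def resample(self, data=()):
        ...


class ScalarGaussianKnownVar(GibbsSampling):
    """Hypothetical toy class: x ~ N(mu, sigma2), sigma2 fixed, mu ~ N(mu_0, tau2_0)."""

    def __init__(self, mu_0=0.0, tau2_0=1.0, sigma2=1.0, seed=None):
        self.mu_0, self.tau2_0, self.sigma2 = mu_0, tau2_0, sigma2
        self._rng = random.Random(seed)
        self.resample()  # with no data this draws mu from the prior

    def log_likelihood(self, x):
        # Log density of a single scalar observation under the current mu.
        return -0.5 * (math.log(2 * math.pi * self.sigma2)
                       + (x - self.mu) ** 2 / self.sigma2)

    def rvs(self, size=1):
        sd = math.sqrt(self.sigma2)
        return [self._rng.gauss(self.mu, sd) for _ in range(size)]

    def resample(self, data=()):
        # Conjugate update: the posterior over mu is Gaussian.
        n = len(data)
        post_prec = 1.0 / self.tau2_0 + n / self.sigma2
        post_mean = (self.mu_0 / self.tau2_0 + sum(data) / self.sigma2) / post_prec
        self.mu = self._rng.gauss(post_mean, math.sqrt(1.0 / post_prec))
        return self
```

Because `resample` is the only place the prior enters, swapping in a different prior means rewriting that one method (and the constructor's hyperparameters) while `log_likelihood` and `rvs` stay the same.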

Since IsotropicGaussian implements those, it supports the GibbsSampling interface, and so it might be an okay reference point. However, classes that I've written more recently tend to be better. The Regression class is a related class I've worked on more recently than any of the Gaussian classes. Maybe DiagonalGaussian is the most recent of the Gaussian ones, though Gaussian gets the most attention.

Once you support that interface, you can pass instances of your class in the obs_distns list and Gibbs sampling will work for any model in pyhsmm or pybasicbayes.
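As a rough illustration of why that interface is enough, the sketch below runs Gibbs sweeps for a toy two-component mixture with uniform mixing weights. It is not pyhsmm's sampler (an HMM or HSMM samples the state sequence with message passing rather than independent labels), and `ToyComponent` and `gibbs_sweep` are hypothetical names made up for this example. The point is only that the model side touches each observation object through `log_likelihood` and `resample` and nothing else.

```python
import math
import random


class ToyComponent:
    """Hypothetical observation class: x ~ N(mu, 1) with mu ~ N(0, 10^2)."""

    def __init__(self, rng):
        self.rng = rng
        self.mu = rng.gauss(0.0, 10.0)  # initial draw from the prior

    def log_likelihood(self, x):
        return -0.5 * (math.log(2 * math.pi) + (x - self.mu) ** 2)

    def resample(self, data=()):
        # Conjugate normal update with unit observation variance.
        prec = 1.0 / 100.0 + len(data)
        mean = sum(data) / prec
        self.mu = self.rng.gauss(mean, math.sqrt(1.0 / prec))


def gibbs_sweep(data, components, rng):
    """One sweep: sample labels given parameters, then parameters given labels."""
    labels = []
    for x in data:
        logp = [c.log_likelihood(x) for c in components]
        m = max(logp)  # subtract the max before exponentiating, for stability
        weights = [math.exp(lp - m) for lp in logp]
        labels.append(rng.choices(range(len(components)), weights=weights)[0])
    for k, c in enumerate(components):
        c.resample([x for x, z in zip(data, labels) if z == k])
    return labels


rng = random.Random(0)
data = ([rng.gauss(-5.0, 1.0) for _ in range(100)]
        + [rng.gauss(5.0, 1.0) for _ in range(100)])
obs_distns = [ToyComponent(rng) for _ in range(2)]
for _ in range(50):
    labels = gibbs_sweep(data, obs_distns, rng)
```

Any class with those two methods could replace `ToyComponent` in the `obs_distns` list without changing `gibbs_sweep`.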

Glad you like pyhsmm, I hope you find it useful!