hildensia / bayesian_changepoint_detection

Methods to get the probability of a changepoint in a time series.
MIT License

Other observation models besides Gaussian #2

Closed · mathDR closed 9 years ago

mathDR commented 9 years ago

Hi. I was wondering if you had any insight into extending your code to include other emission models besides Gaussian. In particular, how about a GMM with a known number of Gaussians?

I was going to take a stab at implementing it and submit a PR, but wanted to get your input first.

Thanks

Dan

hildensia commented 9 years ago

I also thought about that. I'd find it most useful, but haven't had time to implement it yet. So feel free to do a PR. I'd also like an (arbitrary) function plus Gaussian as a model.

mathDR commented 9 years ago

Can you speak a bit about your gaussian_obs_log_likelihood function? Why aren't you using MLE for the log likelihood? (Or are you, and I am missing something?)

mathDR commented 9 years ago

So it looks like, for a given segment, you compute the MLE parameters, then calculate and return the likelihood of the segment with those parameters, correct?

hildensia commented 9 years ago

That sounds right, but I used quite some numerical foo, so everything is obfuscated and I don't remember the details right now. I'm at ICRA (a conference) at the moment, so I'm a bit busy. I'll look through it when I'm back. Maybe I'll even find some spare time this week.
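(For context, a minimal sketch of what such a function typically computes: the segment's marginal likelihood under a conjugate Normal-Gamma prior, with the mean and precision integrated out rather than plugged in as MLEs. The function name and hyperparameter defaults below are illustrative assumptions, not the repo's actual code.)

```python
import numpy as np
from scipy.special import gammaln

def gaussian_marginal_log_likelihood(x, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    """Log marginal likelihood of 1-D data under a Gaussian whose mean and
    precision carry a Normal-Gamma prior (the standard conjugate result)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xbar = x.mean()
    # Standard conjugate posterior updates.
    kappan = kappa0 + n
    alphan = alpha0 + n / 2.0
    betan = (beta0
             + 0.5 * ((x - xbar) ** 2).sum()
             + kappa0 * n * (xbar - mu0) ** 2 / (2.0 * kappan))
    # Marginal likelihood, computed in log space to avoid underflow.
    return (gammaln(alphan) - gammaln(alpha0)
            + alpha0 * np.log(beta0) - alphan * np.log(betan)
            + 0.5 * (np.log(kappa0) - np.log(kappan))
            - (n / 2.0) * np.log(2.0 * np.pi))
```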

mathDR commented 9 years ago

So it looks like you implemented an independent features model. Maybe the next step would be a full covariance model (as in Xuan et al.), then tackling the GMM with either independent features or full covariance. Thoughts?

hildensia commented 9 years ago

Yes! That sounds very reasonable, and it should be straightforward to implement: simply code up the three lines of math in Xuan et al., section 3.2. There is already a gamma function in scipy (scipy.special.gamma). It'll probably make sense to implement everything in log space again, as the probabilities will be very small (use scipy.special.gammaln for that). I guess scipy.stats.wishart is not needed, from what I have seen in the paper.

I'd first implement it naively and then try to optimize it. The independent model is very similar, maybe you can reuse parts of it.
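(A first naive cut at those three lines, in log space, might look like the following. scipy.special.multigammaln is the log multivariate gamma the formula needs; the function name and the ν₀ = D, Λ₀ = I defaults are assumptions for the sketch, not anything fixed by the paper or the repo.)

```python
import numpy as np
from scipy.special import multigammaln

def fullcov_obs_log_likelihood(data, t, s):
    """Log marginal likelihood of data[t:s] under a zero-mean multivariate
    Gaussian with an inverse-Wishart prior on the covariance (Xuan et al.,
    section 3.2), computed entirely in log space."""
    seg = np.asarray(data, dtype=float)[t:s]
    n, dim = seg.shape
    v0 = dim                           # prior degrees of freedom (assumed)
    Lambda0 = np.eye(dim)              # prior scale matrix (assumed)
    vn = v0 + n
    Lambdan = Lambda0 + seg.T @ seg    # posterior scale matrix
    _, logdet0 = np.linalg.slogdet(Lambda0)
    _, logdetn = np.linalg.slogdet(Lambdan)
    return (-(dim * n / 2.0) * np.log(np.pi)
            + multigammaln(vn / 2.0, dim)
            - multigammaln(v0 / 2.0, dim)
            + (v0 / 2.0) * logdet0
            - (vn / 2.0) * logdetn)
```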

mathDR commented 9 years ago

This is exactly what I am doing. Unfortunately, the equations assume zero means for each segment, so I will also have to model the multivariate means using multivariate linear regression. This is discussed in section 3.4 of Xuan.

hildensia commented 9 years ago

True. But I'd suggest implementing 3.2 first and, once it works, adding the non-zero means. (They even say in the paper that it's easy to update.)

mathDR commented 9 years ago

Okay, I have the IFM and the full covariance model (both under the zero-mean assumption) implemented, with examples for both.
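(For reference, the independent features model factorizes over dimensions, so its log marginal is just a per-dimension sum; a sketch reusing the hypothetical univariate marginal from earlier in the thread:)

```python
import numpy as np

def ifm_obs_log_likelihood(data, t, s):
    """Independent features model: each dimension is modeled by its own
    univariate Gaussian, so the joint log marginal sums over dimensions.
    Reuses gaussian_marginal_log_likelihood from the earlier sketch."""
    seg = np.asarray(data, dtype=float)[t:s]
    return sum(gaussian_marginal_log_likelihood(seg[:, d])
               for d in range(seg.shape[1]))
```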

mathDR commented 9 years ago

Looking at section 3.4 of the Xuan paper, I am drawing a blank as to what the default mean basis should be (the H in the paper). I am leaning towards 1, but then I don't know what D becomes. Any thoughts?

hildensia commented 9 years ago

First: could you make a pull request? I'd like to add this as soon as possible. Maybe we should also discuss whether the interface for the whole thing is nice as it is (see https://github.com/hildensia/bayesian_changepoint_detection/issues/4) and write tests for everything. I suppose that would make it much more useful and reliable for others.

To your question: I'm reading through the paper again and trying to understand it again :) I'll come back to you in a minute. It might also be worth contacting Kevin Murphy or Xiang Xuan and asking them for reasonable defaults. Paper authors are usually very happy to get questions.

hildensia commented 9 years ago

Also note that there is a master's thesis by Xiang Xuan on the same topic (I guess it's a longer version of the ICML paper): http://www.cs.ubc.ca/~murphyk/Students/Xuan_MSc07.pdf

hildensia commented 9 years ago

Ok, I'm not 100% sure, but what I got from reading: neither H nor D should have a default value in the first place; both should be chosen by the user. For H we could implement the two examples from the MSc thesis (they are easy). For D, D = np.eye() should maybe do for a start.
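(To make that concrete, one guess at what the two easy basis choices and an identity D could look like; the constant and linear-trend bases below are my reading, not verified against the thesis:)

```python
import numpy as np

def constant_basis(n):
    # H with a single constant regressor: the segment mean is one offset.
    return np.ones((n, 1))

def linear_basis(n):
    # Constant term plus a linear trend in time.
    t = np.arange(n, dtype=float)
    return np.column_stack([np.ones(n), t])

q = 2          # number of basis functions in H
D = np.eye(q)  # identity prior scale for the regression weights, for a start
```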

The technical report by Tom Minka cited in the thesis also gives some insight (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.39.4002&rep=rep1&type=pdf).

mathDR commented 9 years ago

Thanks for the links. A few things:

  1. Do you want me to fork the repo and open the PR from there? (Otherwise you will have to add me to the repository so I can push a new branch and submit the PR.)
  2. I had been trying to conform to your existing code, but I will fall back on my tested version. Note: there is a bit of a new folder structure (examples in one directory, observation models in another, etc.).
  3. Adding linear regression for the mean implies letting a basis set be passed in for use. I am looking at (maybe) an observation function class that can hold parameters, basis functions, etc. (see the sketch after this list). I will do that today.
  4. Think a bit about what you want in terms of tests. Verifying the observation models is easy, but validating them might be tough.
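
(On point 3, a rough sketch of what such an observation-model class could look like; all names and signatures here are hypothetical, not from the repo:)

```python
import numpy as np

class ObservationModel:
    """Hypothetical base class for pluggable observation likelihoods."""

    def __init__(self, basis=None, **hyperparams):
        self.basis = basis                # optional mean-basis function H
        self.hyperparams = hyperparams    # prior parameters of the model

    def log_likelihood(self, data, t, s):
        """Log marginal likelihood of the segment data[t:s]."""
        raise NotImplementedError

class FullCovarianceModel(ObservationModel):
    def log_likelihood(self, data, t, s):
        # Delegate to the full-covariance sketch from earlier in the thread.
        return fullcov_obs_log_likelihood(data, t, s)
```
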
hildensia commented 9 years ago

1) If it's easier for you, I can simply add you to the collaborators. Otherwise, fork-and-PR is the usual way on GitHub.

2) That's all fine. I'm not quite happy with the current structure and API, as I said before. It just happened to end up as it is because of the particular project I was coding for.

4) I was really thinking more of unit tests. Formal validation is always hard and a lot of work; I don't think it's really needed here.

hildensia commented 9 years ago

https://github.com/hildensia/bayesian_changepoint_detection/pull/5 closes this.