hildensia / bayesian_changepoint_detection

Methods to get the probability of a changepoint in a time series.
MIT License
670 stars 213 forks

How to utilize R matrix to detect change points? #20

Closed: mike-ocean closed this issue 5 years ago

mike-ocean commented 5 years ago

In the current version of the code, Nw=10; ax.plot(R[Nw,Nw:-1]) is used to show the changepoints. Although it works fine, I am really confused about the rationale behind it. I tried plotting the run length with the maximum probability at each time step, i.e. the y index of the maximum probability in each x column, but the result showed the run length just keeps going up... I also went back to Adams's paper but found nothing about changepoint identification (he just stops at the R matrix)... I also tried to find Adams's MATLAB code, but the code seems to have been removed...

I am trying to use this method in my work, and I believe it's best to fully understand it before any deployment. Any help will be appreciated, and thanks a lot!

mike-ocean commented 5 years ago

I just found something in the Jupyter notebook:

Because it's very hard to correctly evaluate a change after a single sample of a new distribution, we instead can "wait" for Nw samples and evaluate the probability of a change happening Nw samples prior.

Silly me... I missed this before... and now I believe I understand why it works. But still, I'd like to ask: why is it very hard to correctly evaluate a change after a single sample of a new distribution? Is it because, when receiving the first few samples after a new changepoint, the probability is more "controlled" by the predefined predictive distribution?
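The indexing in R[Nw, Nw:-1] from the quoted notebook passage can be illustrated on a toy matrix. This is only a sketch: the matrix below is an idealized, hand-built run-length matrix (one certain run length per step), not output of the library, and the values of T, c and Nw are made up for the example.

```python
import numpy as np

T, c, Nw = 60, 30, 10  # hypothetical series length, change step, and wait

# Idealized run-length matrix: R[r, t] = 1 when the run length at step t is r.
R = np.zeros((T, T))
for t in range(T):
    r = t if t < c else t - c  # the run length resets to 0 at the change
    R[r, t] = 1.0

# Row Nw, read from column Nw onwards: entry i is
# P(run length == Nw at step Nw + i), i.e. the probability
# that the current run started at step i.
p_change = R[Nw, Nw:-1]
peaks = np.flatnonzero(p_change)
print(peaks)  # -> [ 0 30]
```

The two peaks are the start of the series (step 0) and the change at step 30. Because the slice starts at column Nw, the index into p_change is already shifted back by Nw, so it reads directly as the changepoint location.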

hildensia commented 5 years ago

A non-mathy explanation is: if you just got one sample and you want to match the mean and std of a Gaussian, your guess is likely to be off. If you have observed your data for a while, your moment matching is more reliable.
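The moment-matching argument can be checked numerically. The following is a minimal sketch, assuming Gaussian data with made-up parameters (true_mean, true_std and the number of trials are arbitrary choices for illustration): it estimates the mean from n samples many times over and reports how far off the estimate typically is.

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean, true_std, trials = 3.0, 2.0, 10_000  # hypothetical values


def rms_error_of_mean(n):
    """RMS error of the sample mean over many repeated draws of n samples."""
    samples = rng.normal(true_mean, true_std, size=(trials, n))
    estimates = samples.mean(axis=1)
    return np.sqrt(np.mean((estimates - true_mean) ** 2))


errors = {n: rms_error_of_mean(n) for n in (1, 10, 100)}
for n, err in errors.items():
    print(f"n={n:3d}  RMS error of the estimated mean: {err:.3f}")
```

The error shrinks roughly like true_std / sqrt(n), so a single sample leaves the mean uncertain by about one full standard deviation. The spread is even worse off: with one sample it cannot be estimated from the data at all.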

But go and try it out: set Nw to 1 in the example. I think the likelihoods will just decrease a lot. You can kind of see it in the grayscale plot: at the beginning of each segment there is more uncertainty about the length of the segment (the white line is a bit thicker).
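For anyone who wants to try different Nw values without the notebook, here is a self-contained sketch of the run-length recursion from Adams and MacKay. It is not the library's implementation: the function name bocpd_known_variance is hypothetical, and it assumes a constant hazard and Gaussian data with known variance and a Gaussian prior on the segment mean, rather than whatever observation model the notebook uses. The data and all parameter values are invented for the example.

```python
import numpy as np


def bocpd_known_variance(x, hazard, mu0=0.0, var0=25.0, var_x=1.0):
    """Return R, where R[r, t] = P(run length r after t observations).

    Sketch assumptions: constant hazard, Gaussian data with known
    variance var_x, and a N(mu0, var0) prior on each segment's mean.
    """
    T = len(x)
    R = np.zeros((T + 1, T + 1))
    R[0, 0] = 1.0
    # Posterior mean/variance of the segment mean, one entry per run length.
    mu, var = np.array([mu0]), np.array([var0])
    for t, xt in enumerate(x):
        # Predictive density of xt under every candidate run length.
        pred_var = var + var_x
        pred = np.exp(-0.5 * (xt - mu) ** 2 / pred_var) / np.sqrt(2 * np.pi * pred_var)
        joint = R[: t + 1, t] * pred
        R[1 : t + 2, t + 1] = joint * (1 - hazard)  # the run grows by one
        R[0, t + 1] = joint.sum() * hazard          # the run resets
        R[:, t + 1] /= R[:, t + 1].sum()
        # Conjugate update; index 0 is reset to the prior for a fresh run.
        new_var = 1.0 / (1.0 / var + 1.0 / var_x)
        new_mu = new_var * (mu / var + xt / var_x)
        mu = np.concatenate(([mu0], new_mu))
        var = np.concatenate(([var0], new_var))
    return R


rng = np.random.default_rng(0)
# Synthetic data: the mean jumps from 0 to 6 at index 100.
x = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(6.0, 1.0, 100)])
hazard = 1 / 100
R = bocpd_known_variance(x, hazard)

for Nw in (0, 1, 10):
    p = R[Nw, Nw:-1]  # entry i: P(a run started at index i), judged Nw samples later
    print(f"Nw={Nw:2d}  P(change at index 100) = {p[100]:.3f}")
```

One thing this sketch makes visible: with a constant hazard, the normalized R[0, t] equals the hazard for every t >= 1, whatever the data, so Nw = 0 cannot reveal a changepoint at all. Rows with larger Nw are the ones that respond to the data, which fits the idea of waiting a few samples before judging.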

hildensia commented 5 years ago

Remember: at no point in time do you know that you are following a new distribution.

mike-ocean commented 5 years ago

Thanks! Your reply is very helpful!