ruipgil / changepy

Time series changepoint detection
MIT License

cost function of normal mean #4

Closed JinnyCC closed 6 years ago

JinnyCC commented 6 years ago

I have a question about the cost function in your changepoint detection algorithm. I compared the pelt function of changepy with the cpt.mean function of the changepoint package in R, using pelt(normal_mean(mydata, var), len(mydata)) and cpt.mean(tmp_data, penalty = "SIC", method = "PELT"), and found that they give different results. Looking more closely at the code, I found that the mean.norm cost function in the changepoint package differs from normal_mean in changepy. normal_mean requires an external variance input, which I assume is just the variance of the whole data set. This variance acts as a constant divisor each time the cost is computed, which is the part I don't understand. Shouldn't the variance be computed separately for each segment, as designed in the changepoint package in R? I am not sure whether this causes the difference in results, which is why I am asking. Could you provide any insight on this?

JinnyCC commented 6 years ago

I found the problem. According to the reference (Haynes, Kaylea, Idris A. Eckley, and Paul Fearnhead. "Efficient penalty search for multiple changepoint problems." arXiv preprint arXiv:1412.3617 (2014)), the cost function should be a squared-error cost. In your code, however, the cost is computed as the sum of the absolute values in the segment minus the square of the sum of the values in the segment, divided by the segment length. After changing the cost function to the squared-error cost, the results are quite consistent with the cpt.mean function in the changepoint package. In addition, I think dividing the cost by the square of an externally supplied variance is unnecessary, since the variance is a constant and does not affect the cost.
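
For reference, here is a minimal sketch of the squared-error cost described above. It is only an illustration, not the code in changepy or in the PR: the name squared_error_cost is hypothetical, and it assumes the cost is called as cost(start, end) on the half-open segment data[start:end].

```python
import numpy as np


def squared_error_cost(data):
    """Build cost(start, end) for the segment data[start:end].

    Hypothetical helper for illustration only; not part of changepy.
    The cost is the sum of squared deviations from the segment mean.
    """
    data = np.asarray(data, dtype=float)
    # Prefix sums of x and x**2 with a leading 0, so that
    # csum[end] - csum[start] is the sum over data[start:end].
    csum = np.concatenate(([0.0], np.cumsum(data)))
    csum2 = np.concatenate(([0.0], np.cumsum(data ** 2)))

    def cost(start, end):
        n = end - start
        s = csum[end] - csum[start]
        s2 = csum2[end] - csum2[start]
        # Identity: sum((x - mean)**2) == sum(x**2) - (sum(x))**2 / n
        return s2 - s * s / n

    return cost


# Toy series with an obvious shift in mean after the third point.
data = [0.0, 0.2, -0.1, 5.1, 4.9, 5.0]
cost = squared_error_cost(data)
split_cost = cost(0, 3) + cost(3, 6)   # cost with one changepoint at index 3
whole_cost = cost(0, 6)                # cost with no changepoint
```

On this toy series, splitting at the shift gives a lower total cost than treating the data as one segment, which is the behaviour a mean-change cost should have. The prefix sums keep each cost evaluation constant-time.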

ruipgil commented 6 years ago

@JinnyCC Can you please create a PR with the correct version?

JinnyCC commented 6 years ago

@ruipgil PR has been created. https://github.com/ruipgil/changepy/pull/5