cvxgrp / GGS

Greedy Gaussian Segmentation
BSD 2-Clause "Simplified" License
96 stars 34 forks source link

breakpoint indices confusion #2

Closed SkynetIsACat closed 6 years ago

SkynetIsACat commented 6 years ago

First of, thanks a lot for sharing the code.

I got a small question regarding the implementation. So in the paper it is stated that the K breakpoints lie between 1 and T+1, where b_0 and b_K+1 are equal to 1 and T+1, respectively. In the implementation it seems like you initialise k_0 to 0 while keeping K+1 to T+1, shouldn't it become T when we start from 0?

Related to this it's not entirely clear to me what the numbers of the breakpoints represent, since the returned breakpoints do not match the indices of x_t, so is it correct that b_0 represents the index of x_1 and following that, x_T < b_K+1. In other words, can the returned breakpoints with exception of the last be used as indices for the columns of the input matrix?

davidhallac commented 6 years ago

Great question! Yes, you are correct in assuming this in part has to do with the 0-indexing of Python (as compared to our paper, where the data is 1-indexed). b_0 is equivalent to x1, and b{K+1} = x_T+1

Essentially, each breakpoint is the index in the Python matrix of the "first" element of a new segment. The last breakpoint is essentially a "non-existent" segment, since there are only K segments but K+1 breakpoints, which is why it goes one past the number of columns in the data.

Hope that clears things up, and let me know if you have any other questions!

SkynetIsACat commented 6 years ago

Thanks a lot for the quick answer. That clears things up for me yep!

davidhallac commented 6 years ago

Perfect, glad that helps!