deepcharles / ruptures

ruptures: change point detection in Python
BSD 2-Clause "Simplified" License
1.64k stars, 161 forks

Clarification on Breakpoint Indexing with Dataframe Input #313

Closed by tg12 1 year ago

tg12 commented 1 year ago

I am currently working with a system that utilizes breakpoints, potentially sourced from a pandas DataFrame, and I have encountered a point of confusion regarding the indexing of these breakpoints.

Could you please clarify whether the breakpoints are 0-indexed? Specifically, when breakpoints are loaded from a list obtained from a DataFrame, does the indexing follow the standard Python convention, where the first element is at index 0?

For example, if I have a DataFrame with n rows and I convert it to a list of breakpoints, should I expect that the last breakpoint corresponds to the last element at index n-1, following zero-based indexing?

Understanding this will ensure proper handling of edge cases when the last breakpoint is accessed or modified.

deepcharles commented 1 year ago

Hi,

Everything is 0-indexed. For instance, for a numpy array signal of shape (n_samples, n_dims) or (n_samples,), if the change-point list is [10, 45, 89, 100], the segments are signal[0:10], signal[10:45], signal[45:89] and signal[89:100]. Each change point is therefore the exclusive end index of one segment and the start index of the next. Also, the last element of a change-point list is the number of samples, i.e. n_samples=100 here, not n_samples-1.
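As a rough illustration of that convention, here is a minimal sketch in plain Python. The helper name segments_from_breakpoints is hypothetical (it is not part of ruptures), and a plain list stands in for the signal. The same half-open slicing should apply to a numpy array, or to a DataFrame via df.iloc[start:end].

```python
def segments_from_breakpoints(bkps):
    """Turn a change-point list into (start, end) pairs, with end exclusive.

    Hypothetical helper for illustration only; not a ruptures function.
    """
    # Every segment starts where the previous one ended; the first starts at 0.
    starts = [0] + list(bkps[:-1])
    return list(zip(starts, bkps))


signal = list(range(100))  # stand-in for a signal with n_samples = 100
bkps = [10, 45, 89, 100]   # change-point list from the example above

bounds = segments_from_breakpoints(bkps)
segments = [signal[start:end] for start, end in bounds]

print(bounds)                      # [(0, 10), (10, 45), (45, 89), (89, 100)]
print([len(s) for s in segments])  # [10, 35, 44, 11]

# The last change point equals the number of samples, so the last valid
# row index is bkps[-1] - 1, and signal[bkps[-1]] would be out of range.
print(bkps[-1] == len(signal))     # True
```

So with n rows, the final entry of the list is n rather than n-1, and it is best used as a slice end rather than as a direct index.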

Hope this helps

tg12 commented 1 year ago

Perfect, thank you!