lamtung16 / ML_ChangepointDetection


Summary of paper: https://academic.oup.com/jrsssb/article/86/2/273/7517020 #11

lamtung16 opened this issue 3 weeks ago

lamtung16 commented 3 weeks ago

@tdhock This paper on changepoint detection using deep learning is quite popular and relatively recent. The reviewer from NeurIPS also referenced this paper and suggested that I incorporate this method into my study. I’ve summarized it and have some thoughts on it. If you have time, please take a look and give me some comments.

Problem Setup

Given:

  1. A sequence of length $$n$$: $$[x_1, x_2, ..., x_n]$$.
  2. A classifier $$\phi$$ that takes as input a segment of length $$m$$ from the sequence and outputs $$\phi(x_i, \ldots, x_{i+m-1}) \in \{0, 1\}$$, where $$0$$ indicates "no change" and $$1$$ indicates "one change."

    This classifier is their novel contribution: it determines whether a segment contains a single changepoint or none. Its architecture is a multi-layer perceptron (input size $$m$$, binary output) with ReLU activation functions.

  3. A threshold parameter $$\gamma$$, where $$0 < \gamma < 1$$.
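To make item 2 concrete, here is a minimal NumPy sketch of such a window classifier. The paper only specifies an MLP with input size $$m$$, ReLU activations, and a binary output; the hidden-layer sizes, the sigmoid output, and the 0.5 decision threshold are illustrative assumptions, not details from the paper.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

class WindowClassifier:
    """Hypothetical MLP phi: length-m window -> {0, 1}.

    Input size m, ReLU hidden layers, binary output (per the summary).
    Hidden sizes and the 0.5 threshold are assumptions for illustration.
    """
    def __init__(self, m, hidden=(32, 32), rng=None):
        rng = np.random.default_rng(rng)
        sizes = [m, *hidden, 1]
        # He initialization for ReLU layers
        self.weights = [rng.normal(0.0, np.sqrt(2.0 / a), (a, b))
                        for a, b in zip(sizes[:-1], sizes[1:])]
        self.biases = [np.zeros(b) for b in sizes[1:]]

    def __call__(self, window):
        h = np.asarray(window, dtype=float)
        for W, b in zip(self.weights[:-1], self.biases[:-1]):
            h = relu(h @ W + b)
        logit = h @ self.weights[-1] + self.biases[-1]
        prob = 1.0 / (1.0 + np.exp(-logit))  # sigmoid on the final logit
        return int(prob[0] >= 0.5)           # 1 = "one change", 0 = "no change"
```

An untrained instance already has the right input/output shape; training it (on simulated windows with zero or one change, as the paper does) is a separate step not shown here.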

Algorithm:

  1. For each window $$[x_i, \ldots, x_{i+m-1}]$$, with $$i$$ ranging from $$1$$ to $$n - m + 1$$, calculate $$L_i = \phi(x_i, \ldots, x_{i+m-1})$$.
  2. For each potential changepoint index $$j$$, ranging from $$m$$ to $$n - m + 1$$, calculate the moving average: $$\hat L_j = \frac{1}{m} \sum_{k = j - m + 1}^{j} L_k$$
  3. Identify maximal-length segments $$[s_k, e_k]$$ where $$\hat L_j \geq \gamma$$ for all $$j$$ in $$[s_k, e_k]$$. For each such segment, define a changepoint at $$\arg\max_{j \in [s_k, e_k]} \hat L_j$$.
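The three steps above can be sketched as follows. This is a hedged illustration, not the authors' code; `toy_phi`, which simply flags any non-constant window, is a hypothetical stand-in for the trained classifier $$\phi$$.

```python
import numpy as np

def detect_changepoints(x, phi, m, gamma):
    """Sliding-window detection: classify windows, smooth, threshold."""
    n = len(x)
    # Step 1: classifier output for each length-m window (0-based start i).
    L = np.array([phi(x[i:i + m]) for i in range(n - m + 1)])
    # Step 2: moving average of the last m outputs.
    # Lhat[t] corresponds to the paper's 1-based index j = t + m.
    Lhat = np.array([L[t:t + m].mean() for t in range(len(L) - m + 1)])
    # Step 3: maximal runs with Lhat >= gamma; argmax (first on ties) per run.
    cps, t = [], 0
    while t < len(Lhat):
        if Lhat[t] >= gamma:
            s = t
            while t < len(Lhat) and Lhat[t] >= gamma:
                t += 1
            cps.append(s + int(np.argmax(Lhat[s:t])) + m)  # 1-based index j
        else:
            t += 1
    return L, Lhat, cps

# Demo with a toy classifier that flags any non-constant window
# (a stand-in for the trained MLP; illustration only).
toy_phi = lambda w: int(len(set(w)) > 1)
x = [1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 1]
L, Lhat, cps = detect_changepoints(x, toy_phi, m=3, gamma=0.5)
print(cps)  # [4, 8]
```

Returned indices use the paper's 1-based convention, so the demo reproduces the worked example below.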

Example

Given:

  1. Sequence $$x = [1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 1] \in \mathbb{R}^{13}$$ (we hope the algorithm detects changepoints at locations 4 and 8).
  2. Trained classifier $$\phi(z_1, z_2, z_3) \in \{0, 1\}$$ (window size $$m = 3$$).
  3. Threshold $$\gamma = 0.5$$.

Algorithm Steps:

  1. For each window of size 3 (i.e., $$(x_i, x_{i+1}, x_{i+2})$$ where $$i$$ ranges from 1 to 11), compute $$\phi(x_i, x_{i+1}, x_{i+2})$$. This results in:

$$L = [L_1, L_2, L_3, L_4, L_5, L_6, L_7, L_8, L_9, L_{10}, L_{11}] = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0]$$

  2. For each possible changepoint $$j$$ from 3 to 11, compute the average $$\hat L_j$$ as:

$$\hat L = [\hat L_3, \hat L_4, \hat L_5, \hat L_6, \hat L_7, \hat L_8, \hat L_9, \hat L_{10}, \hat L_{11}] = [0.33, 0.67, 0.67, 0.33, 0.33, 0.67, 0.67, 0.33, 0]$$

  3. Identify segments $$[4, 5]$$ and $$[8, 9]$$ where $$\hat L_j \geq \gamma$$; taking the argmax within each segment (first index on ties) gives detected changepoints at indices 4 and 8.
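The numbers in this example can be checked directly. The snippet below starts from the classifier outputs $$L$$ listed in step 1 and reproduces steps 2 and 3:

```python
import numpy as np

# Classifier outputs L for the 11 windows, as listed in step 1.
L = np.array([0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0])
m, gamma = 3, 0.5

# Step 2: for 1-based j = 3..11, Lhat is the mean of L_{j-2}..L_j.
Lhat = np.array([L[j - m:j].mean() for j in range(m, len(L) + 1)])

# Step 3: maximal runs with Lhat >= gamma; one changepoint per run.
above = Lhat >= gamma
cps, t = [], 0
while t < len(above):
    if above[t]:
        s = t
        while t < len(above) and above[t]:
            t += 1
        cps.append(s + int(np.argmax(Lhat[s:t])) + m)  # back to 1-based j
    else:
        t += 1
print(cps)  # [4, 8]
```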

Future Plan

lamtung16 commented 3 weeks ago

@tdhock I've been considering the proposed methods, and I don't think Proposed Solution 2 is a good idea, as it is likely to detect two consecutive changepoints.

However, for Proposed Solution 1, in step 2 (prediction), I can select the window size based on sequence length (because the label lengths are related to sequence length; see https://github.com/lamtung16/ML_Changepoint_Detection_Auto/blob/382ff2ceea26b0f442e36e1a90c906e71d5b5ca1/data/cancer/figures/analyze/length_vs_label_length.png). After applying the classifier to a sliding fixed-size window at every point in the sequence, we will have labels that cover the entire sequence, with no overlap between labels, and the union of the labeled regions will equal the entire sequence. Finally, apply LOPART with penalty $$\lambda = 0$$ to detect changepoint locations.