@tdhock this is the outline of my new paper. Could you give me some feedback?

Learning Penalty Parameters for Optimal Partitioning via Automatic Feature Extraction

Abstract

Changepoint detection identifies significant shifts in data sequences and is important in fields such as finance, genomics, and medicine. The Optimal Partitioning (OPART) algorithm locates these changes within a sequence and uses a penalty parameter to control the number of detected changepoints. Previous methods manually extract statistical features from each sequence to form a feature vector, which a predictive model then uses to estimate the penalty value. This study introduces an approach that learns the penalty parameter directly from raw sequences, using recurrent neural networks to automatically extract the features that are relevant for predicting the penalty.

Introduction

Introduce the concept of changepoint detection and its applications.
Note that the OPART algorithm operates with a fixed penalty parameter.
Highlight that prior studies have concentrated on predicting the penalty value using statistical features derived from sequences.
A significant limitation is that manually extracted features may not always be relevant for predicting the penalty parameter. Researchers have attempted to mitigate this by generating extensive feature sets through transformations such as logarithm, log-log, absolute value, square, and square root. Linear models apply L1 regularization to diminish the influence of irrelevant features, while tree-based methods handle this through hyperparameter constraints such as the maximum tree depth or the minimum number of samples required to split a node.
This study employs recurrent neural networks to extract a fixed number of learned features for penalty prediction, rather than depending on a static, hand-crafted feature vector derived from sequences. Because the features are learned jointly with the penalty predictor, the model can adapt them to the data, which is expected to improve the accuracy of penalty predictions.
The most commonly utilized RNN architectures for univariate sequence data are Vanilla RNNs, Long Short-Term Memory (LSTM) networks, and Gated Recurrent Units (GRU).
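
To make the role of the penalty concrete, here is a minimal sketch of optimal partitioning as a dynamic program. It assumes a squared-error segment cost (the outline does not fix the loss), and the function name `opart` and its return format are only illustrative, not the paper's implementation.

```python
from typing import List, Tuple


def opart(y: List[float], penalty: float) -> Tuple[List[int], float]:
    """Optimal partitioning with a squared-error loss (assumed loss).

    Returns (changepoints, total_cost). A changepoint t means that a new
    segment starts at y[t]. The penalty is paid once per changepoint.
    """
    n = len(y)
    # Prefix sums give the cost of any segment in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(y):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def seg_cost(a: int, b: int) -> float:
        # Sum of squared residuals of y[a:b] around its mean.
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    # best[t] is the optimal penalized cost of y[:t].
    # Starting at -penalty means the first segment is not penalized.
    best = [0.0] * (n + 1)
    best[0] = -penalty
    last = [0] * (n + 1)  # start index of the final segment of y[:t]
    for t in range(1, n + 1):
        best_val, best_s = float("inf"), 0
        for s in range(t):
            c = best[s] + seg_cost(s, t) + penalty
            if c < best_val:
                best_val, best_s = c, s
        best[t] = best_val
        last[t] = best_s

    # Backtrack through the stored segment starts.
    changepoints = []
    t = n
    while t > 0:
        s = last[t]
        if s > 0:
            changepoints.append(s)
        t = s
    changepoints.reverse()
    return changepoints, best[n]
```

For example, `opart([0, 0, 0, 0, 5, 5, 5, 5], penalty=1.0)` finds one changepoint at index 4, while `penalty=100.0` finds none. This is why the penalty has to be chosen per sequence: the same data gives different segmentations depending on its value.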

Novelty

A diagram will illustrate the two steps involved in predicting the penalty from sequences: Step 1 focuses on feature extraction, while Step 2 involves using a predictive model to learn the penalty from the extracted features.
Instead of treating feature extraction and penalty learning as separate stages, this study integrates them to directly learn the penalty from raw sequences.
The rationale is that the relevant features from the sequences for predicting the penalty are often unknown. Recurrent networks can automatically extract valuable hidden features that contribute to penalty prediction.
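
A rough sketch of the integrated idea, in plain Python so it runs without any deep learning framework: a one-layer vanilla (Elman) RNN reads a raw sequence of any length, its final hidden state serves as the fixed-size feature vector, and a linear head maps that vector to a penalty. The class name `TinyRNNPenalty`, the random untrained weights, and the choice to predict the log of the penalty are all assumptions for illustration. In the actual experiments the weights would be trained end to end, presumably with a framework.

```python
import math
import random
from typing import List


class TinyRNNPenalty:
    """Untrained one-layer Elman RNN with a linear output head (illustrative)."""

    def __init__(self, hidden_size: int, seed: int = 0) -> None:
        rng = random.Random(seed)

        def u() -> float:
            # Small random initial weight; a trained model would learn these.
            return rng.uniform(-0.5, 0.5)

        self.hidden_size = hidden_size
        self.w_in = [u() for _ in range(hidden_size)]        # input -> hidden
        self.w_hh = [[u() for _ in range(hidden_size)]
                     for _ in range(hidden_size)]            # hidden -> hidden
        self.b_h = [u() for _ in range(hidden_size)]
        self.w_out = [u() for _ in range(hidden_size)]       # hidden -> output
        self.b_out = u()

    def features(self, seq: List[float]) -> List[float]:
        """Step 1: final hidden state, a vector of length hidden_size."""
        h = [0.0] * self.hidden_size
        for x in seq:
            h = [
                math.tanh(
                    self.w_in[j] * x
                    + sum(self.w_hh[j][k] * h[k] for k in range(self.hidden_size))
                    + self.b_h[j]
                )
                for j in range(self.hidden_size)
            ]
        return h

    def predict_log_penalty(self, seq: List[float]) -> float:
        """Step 2: linear head on the extracted features (log scale assumed)."""
        h = self.features(seq)
        return sum(w * v for w, v in zip(self.w_out, h)) + self.b_out
```

Sequences of different lengths all map to a feature vector of the same size, for example `TinyRNNPenalty(hidden_size=4).features(seq)` always has length 4. Since both steps live in one model, a training loss on the predicted penalty can update the feature extractor as well as the head.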

Experiments

Base architectures: RNN, LSTM, and GRU.
Configuration: 1 or 2 layers, with hidden sizes of 2, 4, 8, or 16.
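
The configuration grid above can be enumerated as follows. This is only a sketch: the dictionary keys and the function name `experiment_grid` are hypothetical, and it assumes every architecture is crossed with every layer count and hidden size.

```python
from itertools import product
from typing import Dict, List, Union

ARCHITECTURES = ("RNN", "LSTM", "GRU")
NUM_LAYERS = (1, 2)
HIDDEN_SIZES = (2, 4, 8, 16)


def experiment_grid() -> List[Dict[str, Union[str, int]]]:
    """Full cross product of architecture, layer count, and hidden size."""
    return [
        {"architecture": arch, "num_layers": layers, "hidden_size": hidden}
        for arch, layers, hidden in product(ARCHITECTURES, NUM_LAYERS, HIDDEN_SIZES)
    ]
```

Under that assumption the grid contains 3 x 2 x 4 = 24 model configurations.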

lamtung16 / ML_ChangepointDetection

Paper Revision #10
