shenghua-liu / reading-group

ML, DM, big graph mining, time series mining, anomaly detection

MASA: Motif-Aware State Assignment in Noisy Time Series Data #4

Open tdingquan opened 5 years ago

tdingquan commented 5 years ago

Authors:

  • Saachi Jain - saachi@stanford.edu - Stanford University
  • David Hallac - hallac@stanford.edu - Stanford University
  • Rok Sosic - rok@stanford.edu - Stanford University
  • Jure Leskovec - jure@stanford.edu - Stanford University

Conference:

Keywords:

  • Noisy motif discovery
  • Temporal clustering
  • Multivariate time series

Tasks:

  • Cluster analysis
  • State definition
  • Motif discovery

Code:

Datasets: [TO BE EDITED]

  1. Subjects cycling on an exercise bike, Daily and Sports Activities Data Set
  2. Aircraft, commercial & classified
  3. Automobiles, commercial & classified

Reviewer:

  • Ding Quan
tdingquan commented 5 years ago

One Sentence Summary:

This paper uses an unsupervised learning method to discover common motifs and leverages those motifs to assign states to the system measurements more robustly, where each measurement can be regarded as one moment of a long, noisy multivariate time series observed over a period of time.

The whole process can be seen as an optimization problem.

Problems:

  1. The individual states and state assignments are unknown.
  2. The durations for which measurements remain in any one state are unknown.

Definitions:

  1. State: a prototype of system behavior.
  2. Motif: a sequence of state assignments corresponding to complex behaviors that capture a common sequence of state transitions, where neighboring occurrences of the same state are merged into one (see the sketch below).
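To make the "merge neighboring occurrences of the same state" part of the motif definition concrete, here is a minimal Python sketch (the function name and representation are mine, not the paper's):

```python
from itertools import groupby

def collapse_states(state_seq):
    """Merge consecutive repeats of the same state, e.g. [1,1,2,2,2,1] -> [1,2,1].

    A motif is then a short pattern over this collapsed sequence, so it captures
    the order of state transitions rather than their exact durations.
    """
    return [state for state, _ in groupby(state_seq)]

# Noisy assignments with different durations map to the same collapsed motif (1, 2, 1).
print(collapse_states([1, 1, 2, 2, 2, 1]))        # [1, 2, 1]
print(collapse_states([1, 2, 2, 2, 2, 2, 1, 1]))  # [1, 2, 1]
```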

Input & Output:

  • Input: A sequence of T measurements, where each measurement is a vector of data values observed at time t.

  • Output: the sequence of states corresponding to the measurements, and the motif patterns hidden in that state sequence.

Models:

  1. State model Θ
  2. Assignment of states to measurements S
  3. Assignment of motifs to measurements M
  4. Optimize: jointly over Θ, S, and M [objective-function screenshot not recovered; a hedged sketch of its shape follows this list]
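Since the screenshot of the objective did not survive, here is only a rough sketch of its shape, reconstructed from the three quantities listed above; the motif-reward term and its weight γ are my assumptions, not the paper's exact formula:

```latex
% Schematic only: a likelihood term for the measurements under the state model,
% plus an assumed reward (weight \gamma) for state assignments that agree with
% the discovered motifs. The paper's actual objective is in the missing screenshot.
\max_{\Theta,\, S,\, M} \;
  \ell\!\left(x_1, \dots, x_T \mid \Theta, S\right)
  \;+\; \gamma \, \mathrm{MotifReward}(S, M)
```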

Methods & Process:

  1. Expectation-Maximization (EM)-type approach (a toy skeleton of this loop is sketched after this list):

     Initialize: initialize the state model Θ.

     E-Step: assign states to measurements, discover motifs, and update the state assignments.

       E-Step A: discover candidate motifs [formula screenshot not recovered].
       E-Step B: use the motifs to assign states [formula screenshot not recovered].

     M-Step: update the state probability model Θ with the updated state assignments.

  2. Viterbi algorithm, to solve the state assignment problem
  3. Toeplitz Inverse Covariance-based Clustering (TICC) model, to define each state
  4. Hidden Markov model, to model each motif separately
  5. Suffix array, to find the maximal subsequences, where a maximal subsequence is one that cannot be extended to either the left or the right without changing its set of occurrences in S (illustrated in the second sketch below).
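To make the loop structure in item 1 concrete, here is a minimal, self-contained Python sketch. Everything in it is a stand-in chosen for illustration, not the authors' code: per-state means replace the TICC state model, bigram counting over the collapsed sequence replaces the paper's motif discovery, and a plain nearest-mean assignment replaces the motif-aware Viterbi pass.

```python
import numpy as np
from itertools import groupby
from collections import Counter

def fit_state_model(X, states, k):
    """M-step stand-in: per-state means (the paper fits a TICC model here)."""
    return np.array([X[states == s].mean(axis=0) if np.any(states == s) else X.mean(axis=0)
                     for s in range(k)])

def discover_motifs(states, top=3):
    """E-step A stand-in: most frequent state bigrams in the collapsed sequence."""
    collapsed = [s for s, _ in groupby(states)]
    return Counter(zip(collapsed, collapsed[1:])).most_common(top)

def assign_states(X, means):
    """E-step B stand-in: nearest-mean labels (no motif reward, no Viterbi pass)."""
    dists = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def masa_like_em(X, k, n_iters=10, seed=0):
    """Toy EM-type loop mirroring the Initialize / E-step / M-step structure above."""
    rng = np.random.default_rng(seed)
    states = rng.integers(0, k, size=len(X))   # random start
    theta = fit_state_model(X, states, k)      # Initialize: the state model Θ
    for _ in range(n_iters):
        motifs = discover_motifs(states)       # E-step A: discover candidate motifs
        states = assign_states(X, theta)       # E-step B: assign states (the paper also uses the motifs here)
        theta = fit_state_model(X, states, k)  # M-step: update Θ from the new assignments
    return states, motifs
```

Running it on a (T, d) NumPy array returns per-measurement labels plus the most common collapsed-state bigrams; in the real method each stand-in would be replaced by the corresponding component from items 2–5.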
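And to illustrate the "maximal subsequence" definition in item 5, here is a brute-force Python check of that property on a small collapsed state sequence. The paper finds these patterns efficiently with a suffix array; this quadratic scan exists only to make the definition concrete, and the helper names are mine.

```python
def occurrences(seq, pattern):
    """Start positions of every occurrence of `pattern` (a tuple) in `seq`."""
    m = len(pattern)
    return [i for i in range(len(seq) - m + 1) if tuple(seq[i:i + m]) == pattern]

def maximal_subsequences(seq, min_count=2):
    """Repeated patterns that cannot be extended left or right without changing
    their set of occurrences (brute force; the paper uses a suffix array)."""
    seq = list(seq)
    found = set()
    for m in range(1, len(seq)):
        for i in range(len(seq) - m + 1):
            pat = tuple(seq[i:i + m])
            occ = occurrences(seq, pat)
            if len(occ) < min_count:
                continue
            # Can the pattern be extended by one symbol to the right/left while
            # keeping exactly the same occurrences?
            right = {tuple(seq[j:j + m + 1]) for j in occ if j + m < len(seq)}
            left = {tuple(seq[j - 1:j + m]) for j in occ if j > 0}
            ext_right = len(right) == 1 and len(occurrences(seq, next(iter(right)))) == len(occ)
            ext_left = len(left) == 1 and len(occurrences(seq, next(iter(left)))) == len(occ)
            if not ext_right and not ext_left:
                found.add(pat)
    return sorted(found, key=len, reverse=True)

# (1, 2, 3) repeats three times and cannot be extended without losing occurrences.
print(maximal_subsequences([1, 2, 3, 4, 1, 2, 3, 5, 1, 2, 3]))   # [(1, 2, 3)]
```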
tdingquan commented 5 years ago

Problem 1:

During E-Step A, the discovery of candidate motifs, the null model is used to discriminate redundancies. I believe a bracket is missing from the corresponding formula. What should the formula look like?

[screenshot of the formula in question; not recovered]

In particular, what does the B stand for? Is that a standard notation?