kaist-dmlab / DualTF

Breaking the Time-Frequency Granularity Discrepancy in Time-Series Anomaly Detection

This is the implementation of Dual-TF, published at TheWebConf 2024 [paper].

1. Overview

In light of the remarkable advancements made in time-series anomaly detection (TSAD), recent emphasis has been placed on exploiting the frequency domain as well as the time domain to address the difficulties in precisely detecting pattern-wise anomalies. However, in terms of anomaly scores, the window granularity of the frequency domain is inherently distinct from the data-point granularity of the time domain. Owing to this discrepancy, the anomaly information in the frequency domain has not been utilized to its full potential for TSAD. In this paper, we propose a TSAD framework, Dual-TF, that simultaneously uses both the time and frequency domains while breaking the time-frequency granularity discrepancy. To this end, our framework employs nested-sliding windows, with the outer and inner windows responsible for the time and frequency domains, respectively, and aligns the anomaly scores of the two domains. As a result of the high resolution of the aligned scores, the boundaries of pattern-based anomalies can be identified more precisely. In six benchmark datasets, our framework outperforms state-of-the-art methods by 12.0–147%, as demonstrated by experimental results.
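The alignment step described above can be illustrated with a small sketch. It assumes a single outer window of a univariate series and slides inner windows over it with stride 1. Each inner window gets one frequency-domain score, and each data point then receives the average score of the inner windows that cover it, so the frequency scores end up at the same point-level granularity as time-domain scores. The function names and the spectrum-distance score are hypothetical placeholders chosen for illustration; they are not the models or the scoring used in the paper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def point_scores_from_windows(window_scores, window_len, n_points):
    """Spread one score per stride-1 window onto the points it covers,
    averaging over all windows that overlap each point."""
    totals = np.zeros(n_points)
    counts = np.zeros(n_points)
    for start, score in enumerate(window_scores):
        totals[start:start + window_len] += score
        counts[start:start + window_len] += 1
    return totals / counts


def toy_frequency_scores(outer_window, inner_len):
    """Placeholder frequency-domain score (an assumption, not the paper's model):
    distance of each inner window's amplitude spectrum from the median spectrum."""
    inner = sliding_window_view(outer_window, inner_len)  # (n_inner, inner_len)
    spectra = np.abs(np.fft.rfft(inner, axis=1))          # amplitude spectra
    reference = np.median(spectra, axis=0)                # "typical" spectrum
    window_scores = np.linalg.norm(spectra - reference, axis=1)
    # Align window-level scores to data-point granularity.
    return point_scores_from_windows(window_scores, inner_len, len(outer_window))


# Illustrative outer window: a sine wave with a short high-frequency burst.
t = np.arange(64)
outer = np.sin(2 * np.pi * t / 16)
outer[30:38] += 5.0 * (-1.0) ** np.arange(8)
freq_scores = toy_frequency_scores(outer, inner_len=8)  # one score per data point
```

Because every point is covered by at least one inner window, the aligned scores have the same length as the outer window and can be combined point by point with a time-domain score.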

2. Public Datasets

| Name | Application | # Train | # Test | Entity × Dimension | # Point Anomaly (Ratio) | # Pattern Anomaly (Ratio) | Source |
|---|---|---|---|---|---|---|---|
| TODS(Point) | Synthetic | 20,000 | 5,000 | 2 × 1 | 250 (100%) | 0 (0%) | link |
| TODS(Pattern) | Synthetic | 20,000 | 5,000 | 3 × 1 | 0 (0%) | 250 (100%) | link |
| ASD | Server Monitoring | 8,527 | 4,320 | 12 × 19 | 0 (0%) | 199 (100%) | link |
| ECG | Medical Checkup | 6,995 | 2,851 | 9 × 2 | 0 (0%) | 208 (100%) | link |
| PSM | Server Monitoring | 132,481 | 87,841 | 1 × 25 | 16 (0.07%) | 24,365 (99.93%) | link |
| Company A | Server Monitoring | 21,600 | 13,302 | 3 × 8 | 10 (8.53%) | 104 (91.47%) | Private |

3. Requirements and Installations

4. Configuration

Dual-TF was implemented in Python 3.8.12.

5. How to run

- From the directory that contains all the source code, run main.py with the following parameters:
  - --input_c: the dataset's dimension (integer)
  - --output_c: the dataset's dimension (integer)
  - --form: the data form for the TODS dataset (string, e.g. point, context, shaplet, seasonal, trend)
  - --data_num: the number of the dataset (integer)
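As a rough illustration of how the four flags documented in this section fit together, the sketch below builds a stand-alone argument parser and parses one example set of values. The parser is hypothetical and mirrors only these four flags; the actual main.py may define additional options, defaults, or required arguments. The values are illustrative: the dimension 1 matches the TODS rows in the dataset table, while the --data_num value 0 is a placeholder.

```python
import argparse


def build_parser():
    # Hypothetical parser that mirrors only the four documented flags;
    # it is not the parser defined in the repository's main.py.
    parser = argparse.ArgumentParser(description="Dual-TF flag sketch (illustrative)")
    parser.add_argument("--input_c", type=int, help="the dataset's dimension")
    parser.add_argument("--output_c", type=int, help="the dataset's dimension")
    parser.add_argument("--form", type=str, help="data form for the TODS dataset")
    parser.add_argument("--data_num", type=int, help="the number of the dataset")
    return parser


# Example values (assumptions): a univariate TODS series in "point" form.
args = build_parser().parse_args(
    ["--input_c", "1", "--output_c", "1", "--form", "point", "--data_num", "0"]
)
print(args.input_c, args.output_c, args.form, args.data_num)
```

Assuming no other flags are required, the same values would be passed on the command line as `python main.py --input_c 1 --output_c 1 --form point --data_num 0`.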