MSRDL / Deep4Cast

Probabilistic Multivariate Time Series Forecast using Deep Learning
BSD 3-Clause "New" or "Revised" License

Design metric for evaluating time series anomaly detection #37

Closed · zer0n closed 6 years ago

zer0n commented 6 years ago
satyanshukla commented 6 years ago

Evaluation metrics for the different labeling cases in anomaly detection:

  1. Point-wise labeling: area under the ROC curve (AUC), precision, and recall (see the sketch after this list)
  2. Region-wise labeling:
    • Mean Average Precision (mAP): the same metric used in object detection
    • Weighted mAP: gives more weight to early detection (using a sigmoid-like function)
  3. Fixed-length intervals: area under the ROC curve
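To make case 1 concrete, here is a minimal sketch of the point-wise metrics using scikit-learn. The `labels` and `scores` arrays and the 0.5 threshold are invented for illustration; they are not from this repo.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, precision_score, recall_score

# Hypothetical ground-truth point-wise labels and per-point anomaly scores.
labels = np.array([0, 0, 1, 1, 0, 0, 1, 0])
scores = np.array([0.1, 0.2, 0.8, 0.7, 0.3, 0.1, 0.6, 0.2])

# AUC is threshold-free: it ranks points by score.
auc = roc_auc_score(labels, scores)

# Precision and recall require picking an operating threshold (0.5 assumed here).
preds = (scores >= 0.5).astype(int)
precision = precision_score(labels, preds)
recall = recall_score(labels, preds)
print(f"AUC={auc:.3f} precision={precision:.3f} recall={recall:.3f}")
```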
zer0n commented 6 years ago

On point-wise labeling: if we predict an anomaly 1 time step earlier or later, would that be counted as a misclassification? That doesn't seem right.

On fixed-length interval: I don't understand the AUC metric. What would it mean to be a false positive or a false negative?

On weighted mAP: can you provide the mathematical formulation for the metric? In particular, I'm curious how you apply the Sigmoid function.

satyanshukla commented 6 years ago
  1. Regarding point-wise labeling: we would have some confidence value (a probability or an anomaly score) at each point of the time series, and we can then compute the area under the ROC curve by varying the confidence threshold.
  2. Fixed-interval labeling is similar to (1), except that there is one anomaly score per interval; varying the threshold over those scores again yields an AUC. A false positive is a non-anomalous window classified as anomalous. In effect, we treat each interval as a single point and do classification, which makes more sense than point-wise labeling.
  3. Weighted mAP: Weighted_mAP.pdf
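A minimal sketch of the interval-level AUC described in (2). The thread does not say how per-point scores become per-interval scores, so max-pooling over each window is assumed here as one common choice; `window_scores`, the window length, and the label array are all hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def window_scores(point_scores, window):
    """Collapse per-point anomaly scores into one score per interval (max-pooling assumed)."""
    n = len(point_scores) // window
    return point_scores[: n * window].reshape(n, window).max(axis=1)

rng = np.random.default_rng(0)
point_scores = rng.random(100)  # hypothetical per-point anomaly scores
# Hypothetical 0/1 label per fixed-length interval (20 windows of length 5).
window_labels = np.array([0, 0, 1, 0, 0, 1, 0, 0, 0, 1,
                          0, 0, 0, 0, 1, 0, 0, 1, 0, 0])

# Each interval is treated as a single classification instance, as in (2).
auc = roc_auc_score(window_labels, window_scores(point_scores, window=5))
print(f"interval-level AUC = {auc:.3f}")
```

The sigmoid weighting for weighted mAP is specified in the attached PDF, so no formulation is sketched here.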