pseudo-Skye / TriAD

Welcome to the official repository for the paper "Unraveling the 'Anomaly' in Time Series Anomaly Detection: A Self-supervised Tri-domain Solution." This repository contains the implementation of the proposed self-supervised tri-domain approach for time series anomaly detection.

The possibility of using TriAD on other datasets (such as KPIs)? #5

Open issaccv opened 2 months ago

issaccv commented 2 months ago

Thank you for open-sourcing such great work. I noticed that your paper reports results on the UCR dataset. Given the particularity of the UCR dataset (each time series contains only one anomalous segment), how can TriAD be used on datasets that contain multi-segment anomalies (e.g., KPI and Yahoo)?

Specifically, you used np.argmin and np.unique to find the most anomalous segment.

https://github.com/pseudo-Skye/TriAD/blob/c2a7c6e5b148ace9017272ad8790d8db34d9e7df/train.py#L176-L185

If I want to use TriAD in a wider range of scenarios, how should I treat the scores output by the three domains? Is it possible to do something like the following, where `scores` holds one row of window scores per domain?

```python
# scores: shape (n_domains, n_windows); lower raw score = more anomalous
normalized_score = 1 - (scores - scores.min(axis=1, keepdims=True)) / \
                       (scores.max(axis=1, keepdims=True) - scores.min(axis=1, keepdims=True))
anomaly_score = normalized_score.sum(axis=0)  # aggregate across domains, one score per window
```
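For example, with dummy scores (the shapes and values here are just placeholders to show what I mean):

```python
import numpy as np

# dummy scores: 3 domains x 5 windows, lower raw score = more anomalous
scores = np.array([[0.9, 0.2, 0.8, 0.7, 0.6],
                   [0.8, 0.3, 0.9, 0.6, 0.7],
                   [0.7, 0.1, 0.8, 0.9, 0.5]])

normalized_score = 1 - (scores - scores.min(axis=1, keepdims=True)) / \
                       (scores.max(axis=1, keepdims=True) - scores.min(axis=1, keepdims=True))
anomaly_score = normalized_score.sum(axis=0)  # window 1 ends up with the highest score
```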
pseudo-Skye commented 2 months ago

Thank you for your interest in my work! TriAD is designed to find a single, subtle anomaly in a long time series, so it can't detect multiple anomalies at once. It relies on MERLIN, which also struggles with multiple similar anomalies. Although I wouldn't recommend it, you can still test TriAD on datasets with multiple anomalies: split the test set into segments, each containing a single anomaly. This can be complex, though, because you need to process the output so that MERLIN can scan it.
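If it helps, here is a minimal sketch of that splitting idea (the helper name, binary label format, and midpoint cut rule are my simplifications for illustration, not part of the TriAD code):

```python
import numpy as np

def split_by_anomaly(series, labels):
    """Split a test series into segments that each contain at most one labeled anomaly.

    series: 1-D array of values; labels: 1-D binary array (1 = anomalous point).
    Cut points are placed halfway between consecutive anomalous regions -- a
    simplification for illustration only.
    """
    # locate contiguous anomalous regions via the rising/falling edges of the labels
    edges = np.diff(np.concatenate(([0], labels, [0])))
    starts, ends = np.where(edges == 1)[0], np.where(edges == -1)[0]
    # cut halfway between the end of one region and the start of the next
    cuts = [0] + [(e + s) // 2 for e, s in zip(ends[:-1], starts[1:])] + [len(series)]
    return [(series[a:b], labels[a:b]) for a, b in zip(cuts[:-1], cuts[1:])]
```

Each resulting segment can then be scanned the same way a single UCR series is.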

Yahoo has some mislabeling issues, and KPI mostly suffers from the 'one-liner' problem. There have been debates in the community about whether these datasets should still be kept in evaluation benchmarks. For now, I haven't found any benchmark as reliable as UCR, but I've added a new benchmark called UAD* (datasets with multiple anomalies and none of the flaws discussed above) in the final version of my paper. I think ICDE will release it later this year. If you don't want to go through the complex process of testing TriAD on datasets with multiple anomalies, you can check the evaluation results reported in my paper. Just email me if you want earlier access, and I'll send you the final version.

As for adjusting the scores output by TriAD, feel free to give it a try. See whether summing the normalized scores of the three domains and then setting a threshold makes it applicable to detecting multiple anomalies. TriAD has a lot of room for improvement, but I won't be able to work on it further, so others are more than welcome to build on my model and make it better.
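If someone wants to try that thresholding route, a minimal sketch might look like this (the quantile threshold is an arbitrary choice for illustration, not something evaluated in the paper):

```python
import numpy as np

def flag_anomalous_windows(anomaly_score, quantile=0.99):
    """Flag every window whose aggregated tri-domain score exceeds a quantile threshold.

    anomaly_score: 1-D array of per-window scores, e.g. the sum of normalized
    domain scores suggested above.
    """
    threshold = np.quantile(anomaly_score, quantile)
    return np.where(anomaly_score > threshold)[0]  # indices of suspected anomalous windows
```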