dawnvince / EasyTSAD

A framework for easy running and evaluating your TSAD algorithm.
GNU General Public License v3.0

Why extend the anomaly time steps using margin? #8

Closed by FeiGSSS 2 weeks ago

FeiGSSS commented 3 weeks ago

Dear authors,

I noticed that you use the margin parameter to extend the anomaly periods both before and after the actual anomaly labels in the dataset. While I understand that this might help capture transitions around anomalies, I am curious about the validity of this approach. Wouldn't it make the anomaly detection task artificially easier for the model? Additionally, how is the length of the extension determined for each dataset?

Thank you!

FeiGSSS commented 3 weeks ago

I found the answer in the paper:

Besides, most statistical and deep learning methods generate anomaly scores under prediction-based or reconstruction-based frameworks, both heavily relying on the pattern provided by previous time windows. Hence, the recently concluded anomaly segment, particularly those with longer durations, has the potential to heavily interfere with the subsequent detection process. As shown in Fig. 5, from a qualitative perspective, we observe that the method accurately detects the frequency anomaly event. However, some unexpected false positives emerge due to the aforementioned reasons, resulting in an underestimated evaluation. We slightly extended the anomaly segments (less than 10 time points) to tolerate such occurrences, to make the evaluation more in line with our intuitive comprehension. We carefully handle edge cases to avoid merging two anomaly segments during the operation.
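To make the operation described in this passage concrete, here is a minimal sketch of extending labeled anomaly segments by a margin without letting two neighbouring segments merge. It is not taken from the EasyTSAD source: the function names `find_segments` and `extend_segments` are hypothetical, and the rule used for the edge cases (each side may take at most half of the gap, so that at least one normal point always remains between two segments) is an assumption about one reasonable way to do it.

```python
def find_segments(labels):
    """Return inclusive (start, end) index pairs for each run of 1s."""
    segments = []
    start = None
    for i, value in enumerate(labels):
        if value and start is None:
            start = i
        elif not value and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(labels) - 1))
    return segments


def extend_segments(labels, margin):
    """Widen every anomaly segment by up to `margin` points on each side.

    Assumed edge-case rule (hypothetical, not confirmed by the paper):
    between two segments, each side may grow by at most (gap - 1) // 2
    points, so at least one normal point is always left and the two
    segments never merge. At the series boundaries the extension is
    simply clipped.
    """
    n = len(labels)
    segments = find_segments(labels)
    extended = list(labels)
    for k, (start, end) in enumerate(segments):
        if k == 0:
            left_room = start
        else:
            gap = start - segments[k - 1][1] - 1
            left_room = (gap - 1) // 2
        if k == len(segments) - 1:
            right_room = n - 1 - end
        else:
            gap = segments[k + 1][0] - end - 1
            right_room = (gap - 1) // 2
        lo = start - min(margin, left_room)
        hi = end + min(margin, right_room)
        for i in range(lo, hi + 1):
            extended[i] = 1
    return extended


labels = [0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0]
extended = extend_segments(labels, margin=2)
```

In this example the two original segments stay separate after the extension. Detections that fall just after an anomaly ends land inside the widened label and are not counted as false positives, which is the tolerance the quoted passage describes.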

dawnvince commented 2 weeks ago

Yes, we aim to make the evaluation results align with empirical intuition through this approach. Since different datasets have different characteristics, we currently specify the margin value manually. We follow two principles: First, the margin value cannot be too large, typically less than half a period. Second, since pattern-wise anomalies have a larger impact range and higher detection delay, we use a larger margin for datasets with more pattern-wise anomalies, like UCR.

If you have any other questions, feel free to contact us at any time~