ML4ITS / mtad-gat-pytorch

PyTorch implementation of MTAD-GAT (Multivariate Time-Series Anomaly Detection via Graph Attention Networks) by Zhao et al. (2020, https://arxiv.org/abs/2009.02040).
MIT License

about adjust_predicts(), please!!! #40

Closed fffii closed 4 months ago

fffii commented 7 months ago

First, thanks for making this repo public; I have learned a lot from the issues and from your replies. I have seen the following snippet come up many times:

# state variables, initialized earlier in adjust_predicts():
anomaly_state = False
anomaly_count = 0
latency = 0

for i in range(len(predict)):
    # a predicted anomaly coincides with a true one, and we are not
    # already inside a detected event
    if any(actual[max(i, 0) : i + 1]) and predict[i] and not anomaly_state:
        anomaly_state = True
        anomaly_count += 1
        # walk backwards through the event, marking its missed points
        for j in range(i, 0, -1):
            if not actual[j]:
                break
            else:
                if not predict[j]:
                    predict[j] = True
                    latency += 1
    elif not actual[i]:
        anomaly_state = False
    # while inside a detected event, mark every point as predicted
    if anomaly_state:
        predict[i] = True

It's part of adjust_predicts(), and I am very curious: what is its purpose? And how is the latency worked out?

srigas commented 6 months ago

Dear @fffii ,

The adjust_predicts() function essentially performs what is known in the Anomaly Detection literature as the "Point Adjustment" strategy. You can read more about it here, for example, but in a nutshell, point adjustment is based on the following assumption:

In a real-world scenario, we are mainly interested in continuous anomalous segments rather than point anomalies. A good anomaly detector should be able to identify such events, without necessarily identifying all individual anomalous instances that they contain. Therefore, as long as at least 1 anomaly has been identified within an event, we can consider the entire event as "identified".

This implies that, during model evaluation, all instances of an event are regarded as "correctly identified anomalies", as long as at least one of them was correctly identified by the model. To show an example of this, suppose that we have the following labels for a time-series' data points:

Ground Truth = [0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1]

This corresponds to a time-series with 4 events: 2 point anomalies (near the start and at the end), as well as 2 continuous segments (one with 10 points and one with 3 points). Now let's suppose that our model identified the following labels:

Model Results = [0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1]

Our model correctly identifies the first and last events, i.e. the point anomalies. However, it wrongly flags a point anomaly at index No. 4 (a false positive) and completely misses the continuous segment that consists of 3 points. Nonetheless, it identifies some of the anomalies in the continuous segment that consists of 10 points. The corresponding point-adjusted labels are:

PA Results = [0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

What happened here? The wrongly identified point anomaly at index No. 4 remains there. Additionally, the unidentified continuous segment that consists of 3 points remains unidentified. However, the continuous segment that consists of 10 points is considered to be fully identified, simply because 3 points of it were successfully identified by the model.
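
To make this concrete, here is a minimal, self-contained sketch of the point-adjustment logic applied to the example above. It is a simplified version of what adjust_predicts() does (thresholding, anomaly scores and the latency counter are omitted), so treat it as an illustration rather than the repo's exact code:

import numpy as np

def point_adjust(actual, predict):
    # Mark an entire ground-truth segment as detected if the model
    # flagged at least one of its points.
    actual = np.asarray(actual, dtype=bool)
    adjusted = np.asarray(predict, dtype=bool).copy()
    anomaly_state = False
    for i in range(len(adjusted)):
        if actual[i] and adjusted[i] and not anomaly_state:
            anomaly_state = True
            # back-fill the points of this event that were missed
            for j in range(i - 1, -1, -1):
                if not actual[j]:
                    break
                adjusted[j] = True
        elif not actual[i]:
            anomaly_state = False
        if anomaly_state:
            adjusted[i] = True
    return adjusted

gt   = [0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1]
pred = [0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
print(point_adjust(gt, pred).astype(int).tolist())
# [0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

Note that the false positive at index No. 4 survives adjustment: point adjustment only ever flips missed points inside true events to positives, it never removes false alarms.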

My personal view on the matter is that Point Adjustment is a way of presenting very high Accuracies and F1-Scores while sweeping the truth about a model's actual capabilities under the rug. This is why it has been heavily criticized in the past few years, and alternatives have been suggested in the literature.

Finally, as far as latency is concerned, this is simply "the time it took the model to identify an event". In the previous example, the model's latency in identifying the point anomalies is zero, because it correctly identified both of them. For the continuous segment of 3 anomalies, the latency is infinite, because the event was never identified. For the continuous segment of 10 anomalies, the latency is 4 timestamps, because the first 4 anomalies of the event were missed by the model.
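
Under that per-event definition, latency can be read off directly: for each ground-truth segment, count how many of its leading points the model missed before its first hit. A small sketch (the helper name per_event_latency is hypothetical, not part of the repo):

def per_event_latency(actual, predict):
    # For each ground-truth segment, return the number of leading points
    # missed before the first detection; None means the event was missed.
    latencies = []
    i, n = 0, len(actual)
    while i < n:
        if actual[i]:
            start = i
            while i < n and actual[i]:
                i += 1
            segment = predict[start:i]
            first_hit = next((k for k, p in enumerate(segment) if p), None)
            latencies.append(first_hit)
        else:
            i += 1
    return latencies

gt   = [0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1]
pred = [0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
print(per_event_latency(gt, pred))  # [0, 4, None, 0]

The snippet in the original question accumulates the same quantity differently: each time an event is first detected, the backwards loop increments latency once per back-filled point, so the detected events above would contribute 0 + 4 + 0 = 4 in total.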

fffii commented 5 months ago

Thanks! It's very helpful to me. Have a good day!