ruiking04 / COCA

Deep Contrastive One-Class Time Series Anomaly Detection

Results Files #21

Closed · marciahon29 closed 9 months ago

marciahon29 commented 9 months ago

What is the difference between log.txt and UCR_summary.csv?

Please could you explain the values in UCR_summary.csv?

Why are the Precision/Recall different from each other (in the log.txt and UCR_summary.csv files)?

Is the "F1" value the same as "Affiliation F1" (Table 2)?

In UCR_summary, what do the numbers next to the model mean? For example: DAGMM_0.7066189827640826

Thanks, Marcia

ruiking04 commented 9 months ago

This file is the result of calling Salesforce-Merlion. Let me explain briefly. There are many time-series anomaly detection metrics, including point-wise (PW), point-adjusted (PA), and revised point-adjusted (RPA) metrics. The "F1" in the UCR_summary.csv file refers to the RPA F1.

For a detailed explanation of each metric, you can refer to Salesforce-Merlion's paper: https://arxiv.org/abs/2109.09265

Affiliation F1 is another evaluation metric; we refer to this paper: https://arxiv.org/abs/2206.13167. Although our paper used the Affiliation F1 metric, our recent experiments found that it also has some problems and tends to overestimate model performance. So I recommend using the RPA metrics as indicators.

Why not use PA? You can refer to this paper: https://ojs.aaai.org/index.php/AAAI/article/view/20680. It describes in detail how PA overestimates model performance.
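For concreteness, here is a minimal sketch of the point-adjustment step as it is commonly defined in that literature (an illustration of the evaluation trick, not code from this repository): if even one point inside a ground-truth anomaly segment is flagged, PA credits the detector with the whole segment, which is exactly what inflates the scores.

```python
# Sketch of point adjustment (PA). Assumes binary 0/1 labels;
# illustrative only, not taken from the COCA codebase.
def point_adjust(y_true, y_pred):
    y_adj = list(y_pred)
    n = len(y_true)
    i = 0
    while i < n:
        if y_true[i] == 1:                    # start of a true anomaly segment
            j = i
            while j < n and y_true[j] == 1:   # find the end of the segment
                j += 1
            if any(y_adj[i:j]):               # one detection anywhere in it...
                for k in range(i, j):
                    y_adj[k] = 1              # ...credits every point in it
            i = j
        else:
            i += 1
    return y_adj

y_true = [0, 1, 1, 1, 1, 0, 0]
y_pred = [0, 0, 0, 1, 0, 0, 0]   # one lucky hit in a 4-point segment
print(point_adjust(y_true, y_pred))  # [0, 1, 1, 1, 1, 0, 0] -> perfect recall
```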

The number in "DAGMM_0.7066189827640826" is a random number appended to the model name to prevent duplicate result files. See https://github.com/ruiking04/COCA/blob/main/baseline.py
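The naming scheme is easy to reproduce; a minimal sketch of the idea (illustrative only, the actual code is in baseline.py):

```python
import random

def unique_model_name(model_name):
    # Append a random float in [0, 1) so repeated runs of the same
    # model don't overwrite each other's result files.
    return f"{model_name}_{random.random()}"

print(unique_model_name("DAGMM"))  # e.g. DAGMM_0.7066189827640826
```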

ruiking04 commented 9 months ago

Precision and recall are somewhat contradictory indicators, so F1 is generally used to balance them. To give a simple example: if we regard all samples as anomalies, recall is 1 but precision is very low. When precision and recall differ greatly, it indicates that the model itself performs poorly.
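A toy calculation makes the trade-off explicit (plain Python, not COCA code): labelling every point as anomalous drives recall to 1, but precision collapses to the anomaly base rate and F1 stays low.

```python
y_true = [0] * 95 + [1] * 5   # 5% of points are true anomalies
y_pred = [1] * 100            # degenerate detector: everything is an anomaly

tp = sum(t and p for t, p in zip(y_true, y_pred))        # true positives
fp = sum((not t) and p for t, p in zip(y_true, y_pred))  # false positives
fn = sum(t and (not p) for t, p in zip(y_true, y_pred))  # false negatives

precision = tp / (tp + fp)   # 5 / 100 = 0.05
recall = tp / (tp + fn)      # 5 / 5   = 1.00
f1 = 2 * precision * recall / (precision + recall)
print(precision, recall, round(f1, 4))  # 0.05 1.0 0.0952
```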

marciahon29 commented 9 months ago

Just to confirm: the results for COCA (accuracy, precision, recall, F1) are at the end of the log, as in the following quoted output:


```
{'accuracy': 1.0}
UCR metrics:
accuracy: {'accuracy': 1.0}
affiliation metrics:
Precision: 0.97244  Recall: 0.66970  f1: 0.79316
Revised-point-adjusted metrics:
F1 score: 0.94118  Precision: 0.94118  Recall: 0.94118
Point-adjusted metrics:
F1 score: 0.97425  Precision: 0.99990  Recall: 0.94987
NAB Scores:
NAB Score (balanced): 0.80292
NAB Score (high precision): 0.80270
NAB Score (high recall): 0.84900
seed: 1
config setup:
dataset: UCR
input_channels: 1
kernel_size: 8
stride: 1
final_out_channels: 64
hidden_size: 128
num_layers: 3
project_channels: 32
dropout: 0.45
features_len: 18
window_size: 64
time_step: 4
num_epoch: 50
freeze_length_epoch: 10
change_center_epoch: 10
center_eps: 0.1
omega1: 1
omega2: 0.1
beta1: 0.9
beta2: 0.99
lr: 0.0003
drop_last: False
batch_size: 512
nu: 0.01
detect_nu: 0.0005
threshold_determine: one-anomaly
objective: one-class
loss_type: distance
augmentation: <conf.coca.UCR_Configs.augmentations object at 0x2aab7f0d7730>
scale_ratio: 0.2
jitter_ratio: 0.3
Training time is : 0:10:15.471088
```

marciahon29 commented 9 months ago

Thank you, I understand now.