Open enthu-sh opened 5 years ago
The labels represent individual ranges labeled by humans.
NAB uses anomaly windows for scoring because anomalies are temporal and can span a period of time. These windows are stored in combined_windows.json and are calculated from the individual labels as described in 'Appendix B: Label combining algorithm' of the NAB whitepaper.
Sometimes if two labels are close together, they will be combined into one window, so that might have happened here.
I want to use the realAWS dataset for anomaly detection, but the anomaly labels are not clear. The labels folder contains two files: combined_labels.json and combined_windows.json. The entries for the same system do not match between these two files. For example:
"realAWSCloudwatch/ec2_cpu_utilization_24ae8d.csv": [ [ "2014-02-26 13:45:00.000000", "2014-02-27 06:25:00.000000" ], [ "2014-02-27 08:55:00.000000", "2014-02-28 01:35:00.000000" ] ],
This is the entry in combined_windows.json, and
"realAWSCloudwatch/ec2_cpu_utilization_24ae8d.csv": [ "2014-02-26 22:05:00", "2014-02-27 17:15:00" ],
this is the entry in combined_labels.json.
Why is there such a mismatch? Which file is the correct one to use?
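To see how the two entries quoted above relate, here is a minimal Python sketch that checks which window each label timestamp falls into. It uses only the values quoted in this thread; the helper name `windows_containing` is made up for illustration and is not part of NAB, and the sketch does not reproduce NAB's label combining algorithm.

```python
from datetime import datetime

# Timestamp formats as they appear in the two quoted entries.
WINDOW_FMT = "%Y-%m-%d %H:%M:%S.%f"
LABEL_FMT = "%Y-%m-%d %H:%M:%S"

# Entry quoted from combined_windows.json.
windows = [
    ["2014-02-26 13:45:00.000000", "2014-02-27 06:25:00.000000"],
    ["2014-02-27 08:55:00.000000", "2014-02-28 01:35:00.000000"],
]
# Entry quoted from combined_labels.json.
labels = ["2014-02-26 22:05:00", "2014-02-27 17:15:00"]


def windows_containing(label, windows):
    """Return (start, end) datetimes of every window that contains the label.

    Hypothetical helper for illustration only, not a NAB function.
    """
    t = datetime.strptime(label, LABEL_FMT)
    hits = []
    for start, end in windows:
        s = datetime.strptime(start, WINDOW_FMT)
        e = datetime.strptime(end, WINDOW_FMT)
        if s <= t <= e:
            hits.append((s, e))
    return hits


for label in labels:
    t = datetime.strptime(label, LABEL_FMT)
    for s, e in windows_containing(label, windows):
        # Show how far the label sits from each edge of its window.
        print(label, "is inside", s, "to", e, "| before:", t - s, "| after:", e - t)
```

For this file, each label falls inside exactly one window, with 8 hours 20 minutes on either side. So the two entries are consistent: combined_labels.json holds the label timestamps, and combined_windows.json holds the scoring windows around them.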