wandnz / streamevmon

Framework and pipeline for time series anomaly detection
GNU General Public License v3.0

Improve NAB scores #31

Closed: wandgitlabbot closed this issue 3 years ago

wandgitlabbot commented 3 years ago

In GitLab, by Daniel Oosterwijk on 2020-08-21

The existing detectors (baseline, changepoint, distdiff, mode, spike) have been tested against the NAB dataset, but the results were not very impressive. Three of the five detectors produced no events at all, and the two that did scored with low accuracy. I believe these scores could be improved by tuning the detectors' configurations to the dataset. NAB testing mandates that each detector use a single config for the entire dataset, and that detectors may not look ahead at upcoming data to pre-tune themselves. Luckily, that's how we're already set up :)
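For illustration, a single whole-dataset config might look like the sketch below. The parameter names and values are hypothetical placeholders, not streamevmon's actual configuration keys; the point is only that one set of values is applied unchanged to every NAB data file.

```python
# Hypothetical single-config setup, illustrative only: these are NOT
# streamevmon's real configuration keys or defaults. NAB requires that
# one set of values like this is applied to every data file, with no
# per-file tuning and no look-ahead.
NAB_DETECTOR_CONFIG = {
    "baseline":    {"maxHistory": 50, "threshold": 25},
    "changepoint": {"maxHistory": 60, "triggerCount": 10},
    "distdiff":    {"recentsCount": 20, "zThreshold": 5.0},
    "mode":        {"maxHistory": 30, "minFrequency": 6},
    "spike":       {"lag": 50, "threshold": 30.0},
}
```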

The results are as follows:

| Detector | Standard Profile | Reward Low FP | Reward Low FN |
| --- | --- | --- | --- |
| Baseline | 0.00 | 0.00 | 2.66 |
| Changepoint | 0.00 | 0.00 | 0.00 |
| DistDiff | 10.22 | 3.88 | 17.22 |
| Mode | 0.00 | 0.00 | 0.00 |
| Spike | 0.00 | 0.00 | 0.00 |

The official NAB scoreboard as of the time of writing is reproduced below:

| Detector | Standard Profile | Reward Low FP | Reward Low FN |
| --- | --- | --- | --- |
| Perfect | 100.0 | 100.0 | 100.0 |
| Numenta HTM* | 70.5-69.7 | 62.6-61.7 | 75.2-74.2 |
| CAD OSE | 69.9 | 67.0 | 73.2 |
| earthgecko Skyline | 58.2 | 46.2 | 63.9 |
| KNN CAD | 58.0 | 43.4 | 64.8 |
| Relative Entropy | 54.6 | 47.6 | 58.8 |
| Random Cut Forest **** | 51.7 | 38.4 | 59.7 |
| Twitter ADVec v1.0.0 | 47.1 | 33.6 | 53.5 |
| Windowed Gaussian | 39.6 | 20.9 | 47.4 |
| Etsy Skyline | 35.7 | 27.1 | 44.5 |
| Bayesian Changepoint** | 17.7 | 3.2 | 32.2 |
| EXPoSE | 16.4 | 3.2 | 26.9 |
| Random*** | 11.0 | 1.2 | 19.5 |
| Null | 0.0 | 0.0 | 0.0 |
wandgitlabbot commented 3 years ago

In GitLab, by Daniel Oosterwijk on 2020-08-23

I'm considering using some form of automated parameter tuning to see if I can reduce the workload of doing this manually. It would take a fair bit of work to get functioning, but is likely to be useful on other workloads once complete.
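As a rough sketch of what that automated tuning could look like, here's a simple random search. Everything in it is hypothetical: `run_nab_pipeline` stands in for the run/post-process/score chain described below, and the search space uses made-up parameter names.

```python
import random

# Hypothetical search space: these parameter names and ranges are
# placeholders, not streamevmon's real config options.
SEARCH_SPACE = {
    "distdiff.zThreshold": (1.0, 10.0),
    "distdiff.recentsCount": (10, 200),
}

def sample_config(space):
    """Draw one candidate config, uniformly from each parameter's range."""
    return {
        name: random.uniform(lo, hi) if isinstance(lo, float) else random.randint(lo, hi)
        for name, (lo, hi) in space.items()
    }

def run_nab_pipeline(config):
    """Placeholder for the existing chain: run the detectors with this
    config, post-process the results, and invoke the NAB scorer,
    returning the standard-profile score."""
    raise NotImplementedError

def random_search(space, trials=50):
    """Keep the best-scoring config seen across `trials` random samples."""
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        config = sample_config(space)
        score = run_nab_pipeline(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```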

Currently, we have an entrypoint that runs all the detectors against the NAB dataset. We then do some post-processing of the results in Python before running them through the NAB scorer, which is also written in Python (although we invoke the scorer via a Bash script). Automating this would involve several steps.

I'm going to spin these steps off into separate issues, and maybe even try out GitLab's Milestones feature.
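For reference, gluing the existing stages together from one driver script could be as simple as the sketch below. The commands and script names are placeholders standing in for the pieces described above, not the repository's actual file names.

```python
import subprocess

def run_detectors():
    """Run the entrypoint that feeds the whole NAB dataset through
    every detector. Placeholder command."""
    subprocess.run(["./run-nab-detectors.sh"], check=True)

def postprocess_results():
    """Massage detector output into the layout the NAB scorer expects.
    Placeholder script name."""
    subprocess.run(["python3", "postprocess-nab-results.py"], check=True)

def score_results():
    """Invoke the (Python) NAB scorer via its Bash wrapper.
    Placeholder script name."""
    subprocess.run(["bash", "run-nab-scorer.sh"], check=True)

if __name__ == "__main__":
    run_detectors()
    postprocess_results()
    score_results()
```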

wandgitlabbot commented 3 years ago

In GitLab, by Daniel Oosterwijk on 2021-01-21

I've finished parsing the logs for the final optimisation run, and wrote it up in the wiki at Parameter Tuning Results. I'll add the script I used to run the tests to the repo, then close this issue.