aws / random-cut-forest-by-aws

An implementation of the Random Cut Forest data structure for sketching streaming data, with support for anomaly detection, density estimation, imputation, and more.
https://github.com/aws/random-cut-forest-by-aws
Apache License 2.0
206 stars 33 forks

Make pastValues independent of forecasts #388

Closed sudiptoguha closed 1 year ago

sudiptoguha commented 1 year ago

RCF can produce an expected value while scoring a point for anomalies. However, computing that expected value requires more data than deciding whether the point is an anomaly. In addition, some anomalies will always be detected late. Currently, if a late anomaly is detected within the first 100 points, the past values are not set. While this can be fixed in the code that invokes RCF, it may be simpler to solve it inside RCF.
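For illustration, the caller-side fix could look roughly like the sketch below: the invoking code keeps its own short history of inputs, so the past values of a late-detected anomaly can be looked up whether or not expected values are available yet. This is not the library's API. `PastValueBuffer`, `record`, `past_values`, and the `relative_index` convention (0 for the current point, negative for a point detected late) are all hypothetical names and assumptions made for this example.

```python
from collections import deque


class PastValueBuffer:
    """Hypothetical caller-side helper; not part of the RCF library."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # Keeps only the most recent `capacity` inputs.
        self._recent = deque(maxlen=capacity)

    def record(self, point):
        # Store a copy so later mutation by the caller does not alter history.
        self._recent.append(tuple(point))

    def past_values(self, relative_index):
        """Return the input seen `-relative_index` steps ago.

        Assumed convention: 0 is the current point, negative values refer
        to earlier points. Returns None if that point is not buffered.
        """
        if relative_index > 0:
            raise ValueError("relative_index must be <= 0")
        position = len(self._recent) - 1 + relative_index
        if position < 0:
            return None
        return self._recent[position]


# Usage: record every input before scoring it, then look back on detection.
history = PastValueBuffer(capacity=4)
for value in [1.0, 2.0, 50.0, 3.0]:
    history.record([value])

# Suppose the detector flags the spike one step late (relative index -1).
late_point = history.past_values(-1)
```

Because the lookup depends only on the buffered inputs, it works for an anomaly found within the first 100 points just as it does later in the stream.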