aws / random-cut-forest-by-aws

An implementation of the Random Cut Forest data structure for sketching streaming data, with support for anomaly detection, density estimation, imputation, and more.
https://github.com/aws/random-cut-forest-by-aws
Apache License 2.0
210 stars 33 forks

Add memory estimation #295

Closed ylwu-amzn closed 2 years ago

ylwu-amzn commented 2 years ago

It would be good to know how much memory a model will use, so that we can evaluate whether we have enough memory to run a new model. AD already estimates memory usage: https://github.com/opensearch-project/anomaly-detection/blob/a79028f8ee2098fa94e5e4b8ba81affce24b2b8c/src/main/java/org/opensearch/ad/MemoryTracker.java#L261. We are now going to integrate RCF into MLCommons, which means we would need to copy the same code into MLCommons. How about building memory estimation directly into this RCF library?

sudiptoguha commented 2 years ago

No. As stated, "AD estimates memory usage...", so that belongs in AD. AD should perform benchmarking.