aws / random-cut-forest-by-aws

An implementation of the Random Cut Forest data structure for sketching streaming data, with support for anomaly detection, density estimation, imputation, and more.
https://github.com/aws/random-cut-forest-by-aws
Apache License 2.0

Sample Size & Rust #379

Open acpeakhour opened 1 year ago

acpeakhour commented 1 year ago

Hi,

What is the equivalent of sample size in the Rust version?

sudiptoguha commented 1 year ago

Take a look at https://github.com/aws/random-cut-forest-by-aws/blob/main/Rust/tests/basicrcftest.rs

It is `capacity`: the maximum number of leaves in each tree, which is where the samples should end up. But the meta point is well taken: the variable name should be changed in the outer RCF. Examples (similar to those for the Java version) would also help.
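
To make the role of a capacity bound concrete, here is a minimal, self-contained sketch of a fixed-size stream sample. It does not use the library's API: `BoundedSampler` and all of its fields and methods are hypothetical names invented for illustration, and it uses plain uniform reservoir sampling as a simplification, which may differ from the sampling scheme the library applies. The only point it shows is that however many points are streamed in, at most `capacity` of them are retained, which corresponds to the number of leaves a single tree could hold.

```rust
/// Hypothetical toy sampler for illustration only; not part of the RCF crate.
struct BoundedSampler {
    capacity: usize,  // maximum number of points retained (leaves in one tree)
    seen: u64,        // total number of points offered so far
    points: Vec<f32>, // the retained sample
    state: u64,       // state of a small linear congruential generator
}

impl BoundedSampler {
    fn new(capacity: usize, seed: u64) -> Self {
        BoundedSampler {
            capacity,
            seen: 0,
            points: Vec::with_capacity(capacity),
            state: seed,
        }
    }

    // Minimal pseudo-random generator so the sketch needs no external crates.
    fn next_u64(&mut self) -> u64 {
        self.state = self
            .state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.state >> 33
    }

    /// Offers one point to the sampler; returns true if the point was retained.
    fn update(&mut self, point: f32) -> bool {
        self.seen += 1;
        if self.points.len() < self.capacity {
            // Still below capacity: every point is kept.
            self.points.push(point);
            return true;
        }
        // Uniform reservoir step: keep the new point with probability
        // capacity / seen, overwriting the slot that was drawn.
        let slot = self.next_u64() % self.seen;
        if (slot as usize) < self.capacity {
            self.points[slot as usize] = point;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut sampler = BoundedSampler::new(256, 42);
    for i in 0..10_000 {
        sampler.update(i as f32);
    }
    // The sample never grows past `capacity`, no matter how long the stream is.
    println!("seen = {}, retained = {}", sampler.seen, sampler.points.len());
}
```

In this sketch, raising `capacity` keeps more of the stream per sampler at the cost of more memory, which is the same trade-off the sample size setting controls.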