aws / random-cut-forest-by-aws

An implementation of the Random Cut Forest data structure for sketching streaming data, with support for anomaly detection, density estimation, imputation, and more.
https://github.com/aws/random-cut-forest-by-aws
Apache License 2.0
210 stars 33 forks

merging models #267

Closed. sudiptoguha closed this 3 years ago

sudiptoguha commented 3 years ago

Description of changes: Merges a collection of 1.0 models (with the same parameters) into a single forest whose number of trees is less than or equal to the sum of the number of trees across the different models.
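As a rough illustration of the merge described above, the following Python sketch concatenates the trees of several forests that share the same parameters and optionally caps the result, so the merged forest never holds more trees than the sum of the inputs. The `Forest` class, its fields, and `merge_forests` are hypothetical names invented for this sketch; they are not the library's API, and treating "same parameters" as matching `dimensions` and `sample_size` is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Forest:
    """Hypothetical stand-in for a stored forest (not the library's class)."""
    dimensions: int
    sample_size: int
    trees: List[str] = field(default_factory=list)  # placeholder tree payloads


def merge_forests(forests: List[Forest], max_trees: Optional[int] = None) -> Forest:
    """Merge forests with identical parameters into a single forest.

    The result holds at most sum(len(f.trees)) trees; max_trees can lower the cap.
    """
    if not forests:
        raise ValueError("need at least one forest to merge")
    first = forests[0]
    for f in forests[1:]:
        # Assumption: "same parameters" means these two fields match.
        if (f.dimensions, f.sample_size) != (first.dimensions, first.sample_size):
            raise ValueError("all forests must share the same parameters")

    # Collect every tree from every input forest, in order.
    all_trees = [tree for f in forests for tree in f.trees]
    total = len(all_trees)

    if max_trees is None:
        max_trees = total
    if max_trees < 0:
        raise ValueError("max_trees must be non-negative")

    # Never exceed the sum of the input tree counts.
    kept = all_trees[:min(max_trees, total)]
    return Forest(first.dimensions, first.sample_size, kept)
```

For example, merging a two-tree forest with a one-tree forest yields three trees by default, or two if `max_trees=2` is passed; a forest with different parameters is rejected.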

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

sudiptoguha commented 3 years ago

When the sum is smaller, why does this function return empty while line 172 throws an exception? Should the behaviors be the same?

No, the behavior does not need to be the same. Line 172 should never be executed. It is essential that the number of trees is never changed in the state class without knowing what one is doing. Line 172 is there to prevent such adventures.
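The distinction drawn here, between returning an empty result for an expected condition and throwing at a point that should never be reached, can be sketched roughly as follows. This is a minimal Python illustration with hypothetical names (`ForestState`, `select_trees`, `set_number_of_trees`). It does not reproduce the code at line 172 or the library's state class, and reading "the sum is smaller" as "fewer trees available than requested" is an assumption.

```python
from typing import List, Optional


class ForestState:
    """Hypothetical state holder whose tree count is fixed at construction."""

    def __init__(self, number_of_trees: int, trees: List[str]):
        if len(trees) != number_of_trees:
            raise ValueError("tree list does not match number_of_trees")
        self._number_of_trees = number_of_trees
        self.trees = list(trees)

    @property
    def number_of_trees(self) -> int:
        return self._number_of_trees

    def set_number_of_trees(self, value: int) -> None:
        # Guard: this path is not expected to run in normal use.
        # Failing loudly stops accidental changes to the tree count.
        raise RuntimeError("number of trees must not be changed on an existing state")


def select_trees(available: List[str], requested: int) -> Optional[List[str]]:
    # Assumption: having too few trees is an expected outcome, so it is
    # reported with an empty result (None) instead of an exception.
    if len(available) < requested:
        return None
    return available[:requested]
```

In this sketch a caller can handle the empty result as an ordinary case, while any attempt to change the tree count on an existing state fails immediately.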