aws / random-cut-forest-by-aws

An implementation of the Random Cut Forest data structure for sketching streaming data, with support for anomaly detection, density estimation, imputation, and more.
https://github.com/aws/random-cut-forest-by-aws
Apache License 2.0
206 stars 33 forks

fix for bit conversion #329

Closed sudiptoguha closed 2 years ago

sudiptoguha commented 2 years ago

Issue #, if available: 328

Description of changes: The index manager used in multiple store classes used tobits() to convert an array into a sequence of bits during deserialization, and as a consequence the value 256 was treated as 0. The bug surfaced only in PointStore (and only in the presence of sufficiently many duplicates), because the trees were small. Once the PointStore was corrupted, it would start reporting "incorrect state" errors or attempt to delete the wrong points.
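
For readers unfamiliar with this class of bug, here is a minimal sketch of how an index of 256 can collapse to 0 during an array-to-bits conversion. It is not the repository's code: the class `BitConversionSketch`, both method names, and the assumed mechanism (keeping only the low 8 bits of each index) are hypothetical, chosen only to reproduce the symptom described above.

```java
import java.util.BitSet;

public class BitConversionSketch {

    // Hypothetical buggy conversion (assumption, not the library's tobits()):
    // only the low 8 bits of each index survive, so 256 (0x100) lands on bit 0.
    public static BitSet toBitsTruncating(int[] indices) {
        BitSet bits = new BitSet();
        for (int index : indices) {
            bits.set(index & 0xFF);
        }
        return bits;
    }

    // Conversion that keeps the full index, so 256 stays 256.
    public static BitSet toBitsFull(int[] indices) {
        BitSet bits = new BitSet();
        for (int index : indices) {
            bits.set(index);
        }
        return bits;
    }

    public static void main(String[] args) {
        // Illustrative data: a store whose tracked indices include 256.
        int[] indices = {3, 256};
        System.out.println("truncating: " + toBitsTruncating(indices));
        System.out.println("full:       " + toBitsFull(indices));
    }
}
```

In this sketch a store with fewer than 256 entries never hits the truncation, which is consistent with the observation that small trees were unaffected while a PointStore with many entries was. After the truncating conversion, slot 0 is marked in place of slot 256, so later operations act on the wrong slot.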

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.