aws / random-cut-forest-by-aws

An implementation of the Random Cut Forest data structure for sketching streaming data, with support for anomaly detection, density estimation, imputation, and more.
https://github.com/aws/random-cut-forest-by-aws
Apache License 2.0

Incomplete State 255 Error #328

Closed amitgalitz closed 2 years ago

amitgalitz commented 2 years ago

Steps to reproduce on RCF:

  1. Convert the following model from JSON format to TRCF using the following unit test:

Seed given: 4956439335943109988 Feature input: 88.0

  2. An exception is thrown:

    java.lang.IllegalArgumentException:  incomplete state 255
    
    at com.amazon.randomcutforest.CommonUtils.checkArgument(CommonUtils.java:42)
    at com.amazon.randomcutforest.tree.AbstractNodeStore.growNodeBox(AbstractNodeStore.java:315)
    at com.amazon.randomcutforest.tree.AbstractNodeStore.growNodeBox(AbstractNodeStore.java:327)
    at com.amazon.randomcutforest.tree.RandomCutTree.addPoint(RandomCutTree.java:252)
    at com.amazon.randomcutforest.tree.RandomCutTree.addPoint(RandomCutTree.java:51)
    at com.amazon.randomcutforest.state.RandomCutForestMapper.lambda$singlePrecisionForest$1(RandomCutForestMapper.java:339)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1510)
    at com.amazon.randomcutforest.state.RandomCutForestMapper.singlePrecisionForest(RandomCutForestMapper.java:339)
    at com.amazon.randomcutforest.state.RandomCutForestMapper.toModel(RandomCutForestMapper.java:231)
    at com.amazon.randomcutforest.state.RandomCutForestMapper.toModel(RandomCutForestMapper.java:55)
    at com.amazon.randomcutforest.state.IContextualStateMapper.toModel(IContextualStateMapper.java:24)
    at com.amazon.randomcutforest.state.RandomCutForestMapper.toModel(RandomCutForestMapper.java:299)
    at com.amazon.randomcutforest.parkservices.state.ThresholdedRandomCutForestMapper.toModel(ThresholdedRandomCutForestMapper.java:50)
    at com.amazon.randomcutforest.parkservices.state.ThresholdedRandomCutForestMapper.toModel(ThresholdedRandomCutForestMapper.java:38)
    at com.amazon.randomcutforest.state.IStateMapper.toModel(IStateMapper.java:24)
    at com.amazon.randomcutforest.parkservices.state.V2TRCFToV3StateConverterTest.testJson(V2TRCFToV3StateConverterTest.java:51)...

Steps to reproduce to get the model from AD (OpenSearch 2.0):

  1. Load training data with 100 entities and create an AD detector over it.
  2. Print the TRCF model when the exception is thrown on one of the entities.
  3. Deserialize the model produced for that entity; deserialization fails with the incomplete state 255 error.
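
The AD steps above amount to trying each entity's serialized model and printing the ones that fail to deserialize. The following is only a minimal sketch of that loop, not code from this repository: `FailingModelFinder`, `findFailures`, and the stand-in deserializer in `main` are hypothetical names, and a real run would call the TRCF state mapper in place of the stand-in.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class FailingModelFinder {

    /**
     * Tries to deserialize every entity's serialized model and returns the
     * entities that fail with IllegalArgumentException, mapped to the
     * exception message. The deserializer is supplied by the caller.
     */
    static Map<String, String> findFailures(Map<String, String> serializedByEntity,
                                            Function<String, ?> deserializer) {
        Map<String, String> failures = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : serializedByEntity.entrySet()) {
            try {
                deserializer.apply(entry.getValue());
            } catch (IllegalArgumentException e) {
                // Print the serialized model so it can be replayed in a unit test.
                System.out.println("entity=" + entry.getKey() + " model=" + entry.getValue());
                failures.put(entry.getKey(), e.getMessage());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in deserializer: rejects one marker string.
        // Assumption: the real deserializer throws IllegalArgumentException,
        // as the stack trace in this issue shows.
        Function<String, Object> standIn = json -> {
            if (json.contains("bad")) {
                throw new IllegalArgumentException(" incomplete state 255");
            }
            return json;
        };

        Map<String, String> models = new LinkedHashMap<>();
        models.put("entity-1", "{\"ok\":true}");
        models.put("entity-2", "{\"bad\":true}");

        System.out.println(findFailures(models, standIn));
    }
}
```

Catching only IllegalArgumentException keeps unrelated failures visible; the printed model string is what would be fed back into a unit test such as the one in the first set of steps.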

ylwu-amzn commented 2 years ago

Is this fixed by PR https://github.com/aws/random-cut-forest-by-aws/pull/329?

amitgalitz commented 2 years ago

Yes, this has been fixed and tested with that PR.