Closed ylwu-amzn closed 2 years ago
Tested every commit after PR #282 (AD is using this commit in 1.3.0), after PR https://github.com/aws/random-cut-forest-by-aws/pull/284 , RCF code can't deserialize 1.3.0 AD RCF model any more. This is different with yesterday's test result that RCF commit https://github.com/aws/random-cut-forest-by-aws/commit/2fedfeac8c325b7490594462aa6979659817f9f4 can deserialize AD 1.3.0 model. Will confirm with @amitgalitz to learn why the test results are different. But we still prefer to revert the PR #282 to not add any overhead about AD BWC. And we plan to use the same way of AD to serialize/deserialize RCF model. That will make our future BWC support easier. But cons is we have 2 serialization ways in ml-commons. Check PR https://github.com/opensearch-project/ml-commons/pull/251
Check more details on this issue https://github.com/aws/random-cut-forest-by-aws/issues/305#issuecomment-1079360251
To test the impact of my PR https://github.com/aws/random-cut-forest-by-aws/pull/298 , I applied it to commit https://github.com/aws/random-cut-forest-by-aws/commit/3bb8fcee46507ba83e6ce10432df45254654c363 (PR https://github.com/aws/random-cut-forest-by-aws/pull/283) which used by AD 1.3.0, check my branch https://github.com/ylwu-amzn/random-cut-forest-by-aws/tree/test_ad_protostuff, commit https://github.com/ylwu-amzn/random-cut-forest-by-aws/commit/bc7944eca8d24f008352c57dbe199e47e52958cb. Test locally, it can parse AD 1.3.0 model successfully.
As explained above, PR #298 doesn't impact the deserialization of old AD model. And Sudipto fixed the backward compatibility issue in PR #309. So seems no need to revert the PR #298. Will close this PR for now. If we find PR #298 impacts any part, will revert then.
Signed-off-by: Yaliang Wu ylwu@amazon.com
AD is using protostuff to serialize/deserialize RCF model. With the change "implement Serializable for ThresholdedRandomCutForestState", AD can't deserialize old models. So we have to revert this change. At the same time, we will use the same way of AD to serialize/deserialize RCF in
ml-commons
.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.