Is your feature request related to a problem? Please describe.
Isolation Forest (IF) is a popular unsupervised anomaly detection method that is often used to identify fraud. For example, banks and retail companies use IF to detect zero-day threats, i.e. new threat patterns that supervised algorithms such as XGBoost and GNNs fail to catch because of class imbalance or other issues.
While cuML supports inference on scikit-learn's IF model via the Forest Inference Library (experimental feature; see Issue #3838), it would be great to have IF model training implemented in cuML, similar to the Isolation Forest implementation in scikit-learn.
Describe the solution you'd like
Something like the following:
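# Proposed API: cuml.ensemble.IsolationForest does not exist yet.
from cuml.ensemble import IsolationForest
X = [[-1.1], [0.3], [0.5], [100]]
# Train on unlabeled data, as with scikit-learn's IsolationForest.
clf = IsolationForest(random_state=0).fit(X)
# Label new samples as inliers or outliers.
clf.predict([[0.1], [0], [90]])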
Implementation Details
The following need to be implemented and tested in cuML to enable IF:
Random splitting of tree nodes while building the trees, via NodeSplitKernel
Path length calculation to detect anomalies, similar to the scikit-learn implementation
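To make the two pieces above concrete, here is a minimal CPU-only sketch in plain Python. It is not cuML code and does not touch NodeSplitKernel; the function names (`build_tree`, `path_length`, `anomaly_scores`) are hypothetical and only illustrate the idea. It assumes the standard Isolation Forest formulation: each node picks a random feature and a random threshold between that feature's min and max, and the anomaly score is 2^(-E[h(x)] / c(n)), where h(x) is the path length and c(n) is the average path length for n samples.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def average_path_length(n):
    """c(n): expected path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # approximates H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Grow one isolation tree using a random feature and a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:
        # Simplification: stop instead of retrying with another feature.
        return {"size": len(rows)}
    threshold = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < threshold]
    right = [r for r in rows if r[feature] >= threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x):
    """Depth at which x lands, plus c(size) for the unsplit points in the leaf."""
    depth = 0
    node = tree
    while "feature" in node:
        go_left = x[node["feature"]] < node["threshold"]
        node = node["left"] if go_left else node["right"]
        depth += 1
    return depth + average_path_length(node["size"])


def anomaly_scores(X, queries, n_trees=100, seed=0):
    """Score in (0, 1]; values closer to 1 indicate more anomalous samples."""
    rng = random.Random(seed)
    sample_size = min(256, len(X))
    max_depth = math.ceil(math.log2(max(sample_size, 2)))
    trees = [
        build_tree(rng.sample(X, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    c = average_path_length(sample_size)
    return [
        2.0 ** (-sum(path_length(t, q) for t in trees) / (n_trees * c))
        for q in queries
    ]


X = [[-1.1], [0.3], [0.5], [100]]
scores = anomaly_scores(X, [[0.1], [0], [90]])
```

With the toy data from the proposed API example, the query `[90]` should get a higher score than `[0.1]` and `[0]`, since it is isolated after fewer splits. A GPU implementation would presumably batch the tree construction and the path length traversal rather than recurse per node as done here.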
@vinaydes @dantegd @beckernick @hcho3