h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Performance in Isolation Forest model not available in R but is available in Python #8072

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

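Isolation Forest performance reporting is inconsistent between Python and R. For the same model, Python prints the anomaly score and the normalized anomaly score, while R prints only the metrics header and no scores. Am I doing something wrong?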

Python
{code}
import h2o
from h2o.estimators.isolation_forest import H2OIsolationForestEstimator

h2o.init()  # connect to (or start) a local H2O cluster

train = h2o.import_file("http://s3.amazonaws.com/h2o-public-test-data/smalldata/anomaly/ecg_discord_train.csv")
test = h2o.import_file("http://s3.amazonaws.com/h2o-public-test-data/smalldata/anomaly/ecg_discord_test.csv")

isofor_model = H2OIsolationForestEstimator(sample_size=5, ntrees=7, seed=12345)
isofor_model.train(training_frame=train)

# Training metrics: both anomaly scores are printed
perf = isofor_model.model_performance()
perf

ModelMetricsAnomaly: isolationforest
Reported on train data.

Anomaly Score: 1.7285714285714286
Normalized Anomaly Score: 0.6555555555555554
{code}

R
{code}
library(h2o)
h2o.init()  # connect to (or start) a local H2O cluster

train <- h2o.importFile("http://s3.amazonaws.com/h2o-public-test-data/smalldata/anomaly/ecg_discord_train.csv")
test <- h2o.importFile("http://s3.amazonaws.com/h2o-public-test-data/smalldata/anomaly/ecg_discord_test.csv")

isofor_model <- h2o.isolationForest(training_frame = train, sample_size = 5, ntrees = 7, seed = 12345)

# Training metrics: only the header is printed, no scores
perf <- h2o.performance(isofor_model)
perf

H2OAnomalyDetectionMetrics: isolationforest
Reported on training data.
Metrics reported on Out-Of-Bag training samples
{code}
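As a possible stopgap, the two scores Python reports could be approximated by averaging the per-row prediction columns. The sketch below is plain Python and does not call H2O. It assumes the prediction table has a `predict` column (normalized anomaly score) and a `mean_length` column (mean path length), and that the reported metrics are simple row means. Both are assumptions, not confirmed behavior. `summarize_anomaly_scores` is a hypothetical helper and the sample rows are made up.

```python
from statistics import mean


def summarize_anomaly_scores(rows):
    """Average the per-row scores of an Isolation Forest prediction table.

    `rows` is a list of dicts with the keys "predict" (normalized anomaly
    score) and "mean_length" (mean path length). These column names are an
    assumption about the prediction output.
    """
    if not rows:
        raise ValueError("no prediction rows to summarize")
    return {
        "anomaly_score": mean(r["mean_length"] for r in rows),
        "normalized_anomaly_score": mean(r["predict"] for r in rows),
    }


# Hypothetical rows for illustration only, not taken from the ecg_discord data.
rows = [
    {"predict": 0.5, "mean_length": 2.0},
    {"predict": 0.75, "mean_length": 1.0},
]
summary = summarize_anomaly_scores(rows)
print(summary)
```

With a live cluster, the rows could presumably come from something like `isofor_model.predict(train).as_data_frame().to_dict("records")`. Since the R output says the training metrics are computed on Out-Of-Bag samples, averages taken this way may not match the reported values exactly.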

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-7566
Assignee: New H2O Bugs
Reporter: Angela Bartz
State: Open
Fix Version: Backlog
Attachments: N/A
Development PRs: N/A