h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

GBM: Strange warning regarding a predictor variable which was not added to model #12060

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

I'm using a factor column in R to perform a custom k-fold cross-validation. This column, called 'foldset', is a factor with levels 1 to 5. Level 1 is reserved for testing, and levels 2:5 are used as a custom 'fold_column'. The column 'foldset' is not used as a predictor variable: it is not in the array of inputs 'x', although it is present in the training H2OFrame. However, when I compute predictions for the test set ('foldset = 1'), I get the following warning: "Test/Validation dataset column 'foldset' has levels not trained on: [1]". This is unexpected, since the column is not supposed to be a predictor variable.
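The level mismatch described above can be reproduced in a minimal sketch that does not use H2O at all. It is written in plain Python rather than R so that it runs without a cluster. The data, the `rows` list, and the helper `unseen_levels` are hypothetical and exist only for illustration; this is not H2O's actual check, only an assumption about the kind of comparison that could produce such a message.

```python
# Illustrative only: plain Python, not H2O code. All names here are hypothetical.

# Build 20 toy rows with a 'foldset' level from "1" to "5".
rows = [{"id": i, "foldset": str(i % 5 + 1)} for i in range(20)]

# Level 1 is held out for testing; levels 2 to 5 form the training folds.
train = [r for r in rows if r["foldset"] != "1"]
test = [r for r in rows if r["foldset"] == "1"]


def unseen_levels(train_rows, test_rows, column):
    """Return sorted levels of `column` present in test_rows but absent from train_rows."""
    trained = {r[column] for r in train_rows}
    return sorted({r[column] for r in test_rows} - trained)


train_levels = sorted({r["foldset"] for r in train})
missing = unseen_levels(train, test, "foldset")

print(train_levels)  # levels seen during training
print(missing)       # levels that appear only in the test rows
```

In this sketch the comparison runs on the 'foldset' column whether or not it is a predictor, so level 1 is always reported as unseen. Whether H2O compares levels for every shared column in the same way is not confirmed by this issue.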

hasithjp commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-5188
Assignee: Michal Kurka
Reporter: Victor Teixeira de Melo Mayrink
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A