h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

h2o stacked ensemble throws NPE on 3.26.0.3 #8774

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

{code:java}
09-04 17:47:26.070 10.168.0.136:54321 8742 FJ-1-15 INFO: Starting model se
09-04 17:47:26.103 10.168.0.136:54321 8742 FJ-1-15 INFO: Completing model se
09-04 17:47:26.134 10.168.0.136:54321 8742 FJ-1-15 ERRR: java.lang.NullPointerException
09-04 17:47:26.134 10.168.0.136:54321 8742 FJ-1-15 ERRR: at hex.ensemble.StackedEnsembleModel.checkAndInheritModelProperties(StackedEnsembleModel.java:377)
09-04 17:47:26.134 10.168.0.136:54321 8742 FJ-1-15 ERRR: at hex.ensemble.StackedEnsemble$StackedEnsembleDriver.computeImpl(StackedEnsemble.java:247)
09-04 17:47:26.134 10.168.0.136:54321 8742 FJ-1-15 ERRR: at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:222)
09-04 17:47:26.134 10.168.0.136:54321 8742 FJ-1-15 ERRR: at water.H2O$H2OCountedCompleter.compute(H2O.java:1417)
09-04 17:47:26.134 10.168.0.136:54321 8742 FJ-1-15 ERRR: at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
09-04 17:47:26.134 10.168.0.136:54321 8742 FJ-1-15 ERRR: at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
09-04 17:47:26.134 10.168.0.136:54321 8742 FJ-1-15 ERRR: at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
09-04 17:47:26.134 10.168.0.136:54321 8742 FJ-1-15 ERRR: at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
09-04 17:47:26.134 10.168.0.136:54321 8742 FJ-1-15 ERRR: at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
09-04 17:47:57.921 10.168.0.136:54321 8742 #download INFO: GET /3/Logs/download, parms: {}
09-04 17:47:57.925 1
{code}

exalate-issue-sync[bot] commented 1 year ago

Sebastien Poirier commented: [~accountid:557058:5bcbac08-75cf-4c6b-b4d2-294f7c0fe9b8] do you have more details on how this NPE was obtained?

Just looking at the code, it looks like one of the base models was built without any response column, which is allowed only for unsupervised algorithms. Is it possible that the user was trying to stack unsupervised models? StackedEnsemble doesn’t support that.

The NPE is bad, but I need to know which scenario we’re trying to prevent to translate this into a more meaningful error, thanks.

exalate-issue-sync[bot] commented 1 year ago

Sebastien Poirier commented: According to the detailed logs, the user was trying to build a stacked ensemble (using blending) from {{Generic Models}}. Those lack various {{params}}, including {{responseColumn}}, and the missing {{responseColumn}} is what raised the NPE.

However, the logs don’t provide enough detail to tell why the {{generic model}} didn’t have any {{responseColumn}} (the first one to raise the NPE was constructed from a {{GLM mojo}}). Note that the models were built with the Python client using the {{path}} param.

exalate-issue-sync[bot] commented 1 year ago

Sebastien Poirier commented: For now, I can just fix the NPE and replace it with a {{H2OIllegalArgumentException}} with message: {{StackedModel {id} is missing response_column {column_name}}}
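For illustration only, the kind of guard described above can be sketched in plain Python. This is not the actual H2O code (the real check lives in the Java method {{StackedEnsembleModel.checkAndInheritModelProperties}}); the {{BaseModel}} class and {{check_response_columns}} function are hypothetical names, and {{ValueError}} stands in for {{H2OIllegalArgumentException}}. Only the error message format is taken from the comment above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class BaseModel:
    """Hypothetical stand-in for a base model passed to a stacked ensemble."""
    model_id: str
    # None mimics a Generic model whose params lack a response column.
    response_column: Optional[str]


def check_response_columns(base_models: Sequence[BaseModel], column_name: str) -> None:
    """Fail with a descriptive error instead of a null dereference.

    ValueError is used here as a plain-Python analogue of
    H2OIllegalArgumentException (assumption for this sketch).
    """
    for model in base_models:
        if model.response_column is None:
            raise ValueError(
                f"StackedModel {model.model_id} is missing "
                f"response_column {column_name}"
            )


if __name__ == "__main__":
    models = [BaseModel("gbm_1", "y"), BaseModel("generic_glm", None)]
    try:
        check_response_columns(models, "y")
    except ValueError as err:
        print(err)  # StackedModel generic_glm is missing response_column y
```

The point of the sketch is only that the missing value is detected explicitly and reported with the offending model id, rather than surfacing later as an NPE deep in the ensemble driver.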

exalate-issue-sync[bot] commented 1 year ago

Sebastien Poirier commented: The NPE is fixed; this ticket should, however, be closed once https://0xdata.atlassian.net/browse/PUBDEV-6898 is resolved.

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-6860
Assignee: Michal Kurka
Reporter: Nidhi Mehta
State: In Progress
Fix Version: N/A
Attachments: N/A
Development PRs: Available

Linked PRs from JIRA

https://github.com/h2oai/h2o-3/pull/3903