h2oai / h2o-3

AutoML NPE in benchmark #12518

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

http://mr-0xc1:8080/view/H2OAI/job/h2oai-benchmark-quick/1259/console

Starting automl original with runtime 97
04:49:26 AutoML progress: |███████ (failed)
04:49:26 Traceback (most recent call last):
04:49:26   File "/opt/benchmarks/H2OAIBenchmark.py", line 699, in <module>
04:49:26     do_benchmark(config_file, git_sha, build_number, h2oai_git_sha, runtime_id)
04:49:26   File "/opt/benchmarks/H2OAIBenchmark.py", line 673, in do_benchmark
04:49:26     run(config_file, git_sha, build_number, h2oai_git_sha, runtime_id) # Config file path, git-sha, build-number
04:49:26   File "/opt/benchmarks/H2OAIBenchmark.py", line 667, in run
04:49:26     detect_time_series=detect_time_series, test_file_path=test_file_path)
04:49:26   File "/opt/benchmarks/H2OAIBenchmark.py", line 366, in run_benchmark
04:49:26     stopping_metric=stopping_metric)
04:49:26   File "/opt/benchmarks/H2OAIBenchmark.py", line 57, in do_automl
04:49:26     leaderboard_frame=leaderboard_frame)
04:49:26   File "/h2oai_env/lib/python3.6/site-packages/h2o/automl/autoh2o.py", line 363, in train
04:49:26     self._job.poll()
04:49:26   File "/h2oai_env/lib/python3.6/site-packages/h2o/job.py", line 77, in poll
04:49:26     "\n{}".format(self.job_key, self.exception, self.job["stacktrace"]))
04:49:26 OSError: Job with key $03017f00000132d4ffffffff$_a8617a30d474832781b58e1213324f75 failed with an exception: java.lang.NullPointerException
04:49:26 stacktrace:
04:49:26 java.lang.NullPointerException
04:49:26     at ai.h2o.automl.Leaderboard$1.atomic(Leaderboard.java:291)
04:49:26     at ai.h2o.automl.Leaderboard$1.atomic(Leaderboard.java:252)
04:49:26     at water.TAtomic.atomic(TAtomic.java:17)
04:49:26     at water.Atomic.compute2(Atomic.java:56)
04:49:26     at water.Atomic.fork(Atomic.java:39)
04:49:26     at water.Atomic.invoke(Atomic.java:31)
04:49:26     at ai.h2o.automl.Leaderboard.addModels(Leaderboard.java:341)
04:49:26     at ai.h2o.automl.Leaderboard.addModel(Leaderboard.java:388)
04:49:26     at ai.h2o.automl.AutoML.addModel(AutoML.java:1332)
04:49:26     at ai.h2o.automl.AutoML.pollAndUpdateProgress(AutoML.java:485)
04:49:26     at ai.h2o.automl.AutoML.learn(AutoML.java:1039)
04:49:26     at ai.h2o.automl.AutoML.run(AutoML.java:369)
04:49:26     at ai.h2o.automl.H2OJob$1.compute2(H2OJob.java:32)
04:49:26     at water.H2O$H2OCountedCompleter.compute(H2O.java:1260)
04:49:26     at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
04:49:26     at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
04:49:26     at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
04:49:26     at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
04:49:26     at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
04:49:26
04:49:26 H2O session _sid_b804 closed.

exalate-issue-sync[bot] commented 1 year ago

Magnus Stensmo commented: Got this again just now

Starting automl original with runtime 253
AutoML progress: |██████████ (failed)
Job with key $03017f00000132d4ffffffff$_a2426062423dd9b567f4d669501a4e8e failed with an exception: java.lang.NullPointerException
stacktrace: 
java.lang.NullPointerException
    at ai.h2o.automl.Leaderboard$1.atomic(Leaderboard.java:291)
    at ai.h2o.automl.Leaderboard$1.atomic(Leaderboard.java:252)
    at water.TAtomic.atomic(TAtomic.java:17)
    at water.Atomic.compute2(Atomic.java:56)
    at water.Atomic.fork(Atomic.java:39)
    at water.Atomic.invoke(Atomic.java:31)
    at ai.h2o.automl.Leaderboard.addModels(Leaderboard.java:341)
    at ai.h2o.automl.AutoML.addModels(AutoML.java:1318)
    at ai.h2o.automl.AutoML.pollAndUpdateProgress(AutoML.java:475)
    at ai.h2o.automl.AutoML.defaultGBMs(AutoML.java:769)
    at ai.h2o.automl.AutoML.learn(AutoML.java:1056)
    at ai.h2o.automl.AutoML.run(AutoML.java:369)
    at ai.h2o.automl.H2OJob$1.compute2(H2OJob.java:32)
    at water.H2O$H2OCountedCompleter.compute(H2O.java:1260)
    at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
    at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
    at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
    at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
    at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

Traceback (most recent call last):
  File "/opt/benchmarks/H2OAIBenchmark.py", line 58, in do_automl
    leaderboard_frame=leaderboard_frame)
  File "/h2oai_env/lib/python3.6/site-packages/h2o/automl/autoh2o.py", line 363, in train
    self._job.poll()
  File "/h2oai_env/lib/python3.6/site-packages/h2o/job.py", line 77, in poll
    "\n{}".format(self.job_key, self.exception, self.job["stacktrace"]))
OSError: Job with key $03017f00000132d4ffffffff$_a2426062423dd9b567f4d669501a4e8e failed with an exception: java.lang.NullPointerException
stacktrace: 
java.lang.NullPointerException
    at ai.h2o.automl.Leaderboard$1.atomic(Leaderboard.java:291)
    at ai.h2o.automl.Leaderboard$1.atomic(Leaderboard.java:252)
    at water.TAtomic.atomic(TAtomic.java:17)
    at water.Atomic.compute2(Atomic.java:56)
    at water.Atomic.fork(Atomic.java:39)
    at water.Atomic.invoke(Atomic.java:31)
    at ai.h2o.automl.Leaderboard.addModels(Leaderboard.java:341)
    at ai.h2o.automl.AutoML.addModels(AutoML.java:1318)
    at ai.h2o.automl.AutoML.pollAndUpdateProgress(AutoML.java:475)
    at ai.h2o.automl.AutoML.defaultGBMs(AutoML.java:769)
    at ai.h2o.automl.AutoML.learn(AutoML.java:1056)
    at ai.h2o.automl.AutoML.run(AutoML.java:369)
    at ai.h2o.automl.H2OJob$1.compute2(H2OJob.java:32)
    at water.H2O$H2OCountedCompleter.compute(H2O.java:1260)
    at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
    at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
    at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
    at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
    at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

Writing H2O logs to /tmp/bnpparibas_unmunged_h2o_logs.zip
exiting
H2O session _sid_b083 closed.

http://mr-0xc1:8080/view/H2OAI/job/h2oai-benchmark-oneepoch/871/console
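
For triage, the two stack traces above can be compared mechanically to see where they agree. The following is only a minimal sketch, not part of the benchmark or of H2O: the helper names `parse_frames` and `common_top_frames` are hypothetical, and the regular expression assumes frames in the usual `at package.Class.method(File.java:LINE)` form. The sample strings hold the first eight frames of each report, copied from the logs above.

```python
import re

# Matches one JVM stack frame such as
# "at ai.h2o.automl.Leaderboard$1.atomic(Leaderboard.java:291)".
FRAME_RE = re.compile(r"\bat\s+([\w.$]+)\(([\w$]+\.java):(\d+)\)")


def parse_frames(text):
    """Return (method, file, line) tuples in the order they appear in text."""
    return [(m.group(1), m.group(2), int(m.group(3)))
            for m in FRAME_RE.finditer(text)]


def common_top_frames(trace_a, trace_b):
    """Return the leading frames (from the top of the stack) shared by both traces."""
    shared = []
    for a, b in zip(parse_frames(trace_a), parse_frames(trace_b)):
        if a != b:
            break
        shared.append(a)
    return shared


# First eight frames of the first report (h2oai-benchmark-quick/1259).
first_report = """
    at ai.h2o.automl.Leaderboard$1.atomic(Leaderboard.java:291)
    at ai.h2o.automl.Leaderboard$1.atomic(Leaderboard.java:252)
    at water.TAtomic.atomic(TAtomic.java:17)
    at water.Atomic.compute2(Atomic.java:56)
    at water.Atomic.fork(Atomic.java:39)
    at water.Atomic.invoke(Atomic.java:31)
    at ai.h2o.automl.Leaderboard.addModels(Leaderboard.java:341)
    at ai.h2o.automl.Leaderboard.addModel(Leaderboard.java:388)
"""

# First eight frames of the second report (h2oai-benchmark-oneepoch/871).
second_report = """
    at ai.h2o.automl.Leaderboard$1.atomic(Leaderboard.java:291)
    at ai.h2o.automl.Leaderboard$1.atomic(Leaderboard.java:252)
    at water.TAtomic.atomic(TAtomic.java:17)
    at water.Atomic.compute2(Atomic.java:56)
    at water.Atomic.fork(Atomic.java:39)
    at water.Atomic.invoke(Atomic.java:31)
    at ai.h2o.automl.Leaderboard.addModels(Leaderboard.java:341)
    at ai.h2o.automl.AutoML.addModels(AutoML.java:1318)
"""

shared = common_top_frames(first_report, second_report)
for method, source_file, line in shared:
    print(f"{method} ({source_file}:{line})")
```

Both reports fail at `Leaderboard.java:291` and share every frame down to `Leaderboard.addModels`; they differ only in the caller (`Leaderboard.addModel` in the first run, `AutoML.addModels` via `defaultGBMs` in the second).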

exalate-issue-sync[bot] commented 1 year ago

Navdeep commented: This commit should fix it: https://github.com/h2oai/h2o-3/commit/1cb84d2128cc8d6044c849b893d620ee315a5ec2. If not, please reopen.

exalate-issue-sync[bot] commented 1 year ago

Erin LeDell commented: [~accountid:557058:89297402-cb5a-4710-9511-20f42b25451a] and [~accountid:557058:aa4294f1-b3c0-4c17-9e68-2efb4194388b] I think this should be re-opened now that the commit has been rolled back.

hasithjp commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-5659
Assignee: Navdeep Gill
Reporter: Magnus Stensmo
State: Reopened
Fix Version: N/A
Attachments: N/A
Development PRs: N/A