I am trying to train an XGBoost model on a GPU instance (P3.16xlarge), but it fails with the following error:
OSError: Job with key $03017f000001c68bffffffff$_908b7c8320882d5230bc213e87c44498 failed with an exception: java.lang.NullPointerException
stacktrace:
java.lang.NullPointerException
at hex.tree.xgboost.matrix.SparseMatrixFactory$NestedArrayPointer.set(SparseMatrixFactory.java:87)
at hex.tree.xgboost.matrix.SparseMatrixFactory$InitializeCSRMatrixFromChunkIdsMrFun.map(SparseMatrixFactory.java:166)
at water.LocalMR.compute2(LocalMR.java:84)
at water.LocalMR.compute2(LocalMR.java:76)
at water.LocalMR.compute2(LocalMR.java:76)
at water.LocalMR.compute2(LocalMR.java:76)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1417)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
H2O version: 3.26.0.2
Number of features: 595
Checking whether there is an H2O instance running at http://localhost:35781 ..... not found.
Attempting to start a local H2O server...
Java Version: java version "1.8.0_162"; Java(TM) SE Runtime Environment (build 1.8.0_162-b12); Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode)
Starting server from /usr/local/lib/python3.6/site-packages/h2o/backend/bin/h2o.jar
Ice root: workspace_wish_v18_36_gpu/h2o.train.WISH_V18_36/2019-09-27_20-53-43
JVM stdout: /tmp/tmpfv52e8j9/h2o_hadoop_started_from_python.out
JVM stderr: /tmp/tmpfv52e8j9/h2o_hadoop_started_from_python.err
Server is running at http://127.0.0.1:35781
Connecting to H2O server at http://127.0.0.1:35781 ... successful.
-------------------------- ---------------------------------------------------
H2O cluster uptime: 01 secs
H2O cluster timezone: Etc/UTC
H2O data parsing timezone: UTC
H2O cluster version: 3.26.0.2
H2O cluster version age: 2 months
H2O cluster name: H2O_from_python_hadoop_sp3kyr
H2O cluster total nodes: 1
H2O cluster free memory: 368 Gb
H2O cluster total cores: 64
H2O cluster allowed cores: 64
H2O cluster status: accepting new members, healthy
H2O connection url: http://127.0.0.1:35781
H2O connection proxy:
H2O internal security: False
H2O API Extensions: Amazon S3, XGBoost, Algos, AutoML, Core V3, Core V4
Python version: 3.6.8 final
-------------------------- ---------------------------------------------------
Parse progress: |█████████████████████████████████████████████████████████| 100%
xgboost Model Build progress: |██ (failed)
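The training script itself is not shown above. As a rough sketch of the kind of call that leads to this log, something like the following could be used to reproduce the run and to compare the GPU and CPU backends. The file name `train.csv`, the target column `label`, and all hyperparameter values are hypothetical placeholders, not the settings from the failing job; `backend` is the H2O XGBoost parameter that selects `"gpu"` or `"cpu"`.

```python
def xgboost_params(use_gpu=True):
    """Build estimator settings.

    All values here are hypothetical placeholders; the original
    report does not list the hyperparameters that were used.
    """
    return {
        "ntrees": 50,
        "max_depth": 6,
        # H2O's XGBoost estimator accepts "auto", "gpu" or "cpu" here.
        "backend": "gpu" if use_gpu else "cpu",
        "seed": 1,
    }


def train_xgboost(csv_path, target, use_gpu=True):
    """Start a local H2O cluster, parse the file and train XGBoost.

    h2o is imported lazily so the parameter helper above can be
    inspected on a machine without H2O installed.
    """
    import h2o
    from h2o.estimators import H2OXGBoostEstimator

    h2o.init()
    frame = h2o.import_file(csv_path)

    # Use every column except the target as a predictor.
    predictors = [c for c in frame.columns if c != target]

    model = H2OXGBoostEstimator(**xgboost_params(use_gpu))
    model.train(x=predictors, y=target, training_frame=frame)
    return model
```

Calling `train_xgboost("train.csv", "label")` and then `train_xgboost("train.csv", "label", use_gpu=False)` on the same data would show whether the NullPointerException depends on the backend. This is only a way to narrow the problem down, not a confirmed workaround.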