h2oai / h2o4gpu

H2Oai GPU Edition
Apache License 2.0

h2o4gpu works with Bleeding Edge Cuda 9.0 (Anaconda Python 3.6), but fails with RStudio #674

Closed nikolayvoronchikhin closed 6 years ago

nikolayvoronchikhin commented 6 years ago

Environment:
OS: RHEL 7.3
Installed from: 3 different ways below (worked with bleeding edge)
Anaconda Python 3.6
CUDA 9.0
ldd (GNU libc) 2.17
RStudio Server Version 1.1.414
Microsoft R Open Version 3.4.3

I am trying to install h2o4gpu correctly and test using the examples provided in this repo for both Python 3.6 and R. I have installed all the system prerequisites mentioned in the installation instructions for both Python 3.6 and R: https://github.com/h2oai/h2o4gpu

I have tried installing h2o4gpu for Python 3.6 using the different options below.

  1. From source using make fullinstall: failed with different compilation errors.

  2. pip install --extra-index-url https://pypi.anaconda.org/gpuopenanalytics/simple h2o4gpu
     Gives the same errors as in https://github.com/h2oai/h2o4gpu/issues/257:
     Warning: h2o4gpu_kmeans_lib shared object (dynamic library) ch2o4gpu_cpu.so failed to load.

  3. Bleeding Edge for CUDA 9.0 (as suggested in https://github.com/h2oai/h2o4gpu/issues/257):
     https://s3.amazonaws.com/h2o-release/h2o4gpu/releases/bleeding-edge/ai/h2o/h2o4gpu/0.2-cuda9/h2o4gpu-0.2.0.9999-cp36-cp36m-linux_x86_64.whl

This finally worked today!

[~ bin]$ ./python
Python 3.6.5 |Anaconda, Inc.| (default, Apr 29 2018, 16:14:56)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import h2o4gpu
>>> import numpy as np
>>> X = np.array([[1.,1.], [1.,4.], [1.,0.]])
>>> model = h2o4gpu.KMeans(n_clusters=2,random_state=1234).fit(X)
>>> model.cluster_centers_
array([[1., 1.],
       [1., 4.]])

Now, when I log in to RStudio, it fails with this error:

library("reticulate", lib.loc="~/.localRlibrary")
use_python("~/anaconda3/bin/python")
reticulate::py_discover_config("h2o4gpu")

library("h2o4gpu", lib.loc="~/.localRlibrary")
Error: package or namespace load failed for ‘h2o4gpu’:
.onLoad failed in loadNamespace() for 'h2o4gpu', details:
call: py_module_import(module, convert = convert)
error: ImportError: /lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by ~/anaconda3/lib/python3.6/site-packages/scipy/sparse/_sparsetools.cpython-36m-x86_64-linux-gnu.so)

x <- iris[1:4]
y <- as.integer(iris$Species) - 1

model <- h2o4gpu.random_forest_classifier() %>% fit(x, y)

Error:

predictions <- model %>% predict(x)

Error in eval(lhs, parent, parent) : object 'model' not found
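
The ImportError above says that /lib64/libstdc++.so.6 does not export the CXXABI_1.3.9 version tag that scipy's compiled extension needs. One way to check which tags a given libstdc++ file advertises is to scan it for them. The following is a minimal sketch, not part of h2o4gpu or reticulate: the helper names cxxabi_versions and provides are hypothetical, and the assumption that Anaconda ships its own libstdc++.so.6 under ~/anaconda3/lib is not confirmed anywhere in this thread.

```python
import os
import re

# Numeric version tags such as b"CXXABI_1.3.9" are stored as plain strings
# inside the shared library, so a byte-level scan is enough to list them.
_TAG = re.compile(rb"CXXABI_(\d+(?:\.\d+)*)")


def cxxabi_versions(path):
    """Return the numeric CXXABI versions found in the file, sorted ascending."""
    with open(os.path.expanduser(path), "rb") as fh:
        data = fh.read()
    found = {m.group(1).decode("ascii") for m in _TAG.finditer(data)}
    # Sort numerically so that 1.3.10 comes after 1.3.9.
    return sorted(found, key=lambda v: tuple(int(p) for p in v.split(".")))


def provides(path, wanted="1.3.9"):
    """True if the library at `path` advertises the wanted CXXABI version."""
    return wanted in cxxabi_versions(path)


if __name__ == "__main__":
    # The first path comes from the error message; the second is an assumed
    # location for a libstdc++ bundled with Anaconda.
    for candidate in ("/lib64/libstdc++.so.6", "~/anaconda3/lib/libstdc++.so.6"):
        if os.path.exists(os.path.expanduser(candidate)):
            print(candidate, "provides CXXABI_1.3.9:", provides(candidate))
        else:
            print(candidate, "not found")
```

If the system copy lacks the tag and another copy has it, the failing process is probably resolving libstdc++ from the system directory rather than from the directory that holds the newer copy.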

nikolayvoronchikhin commented 6 years ago

Here is my ~/.bashrc:
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_HOME/lib64/:$CUDA_HOME/lib/:$CUDA_HOME/extras/CUPTI/lib64
export LD_LIBRARY_PATH="~/anaconda3/lib:$LD_LIBRARY_PATH"
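
One detail worth checking in the last export: bash does not expand a tilde inside double quotes, and the dynamic loader treats LD_LIBRARY_PATH entries literally. If the "~" is really in the file, rather than a shortened home directory written for this post, that entry would not point at the Anaconda lib directory. The following is a minimal sketch for spotting such entries; the helper names unexpanded_entries and expanded are hypothetical.

```python
import os


def unexpanded_entries(search_path):
    """Return entries of a colon-separated search path that start with a literal '~'."""
    return [entry for entry in search_path.split(":") if entry.startswith("~")]


def expanded(search_path):
    """Return the search path with each leading '~' replaced by the home directory."""
    return ":".join(os.path.expanduser(entry) for entry in search_path.split(":"))


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    for entry in unexpanded_entries(current):
        print("literal tilde, not expanded by the shell:", entry)
```

Writing the export with $HOME instead of "~" inside the quotes avoids the problem, because variables are still expanded in double quotes.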

When I try from the R console instead, it aborts at the model step.

[ ~]$ R

R version 3.4.3 (2017-11-30) -- "Kite-Eating Tree" Copyright (C) 2017 The R Foundation for Statistical Computing Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R.

Microsoft R Open 3.4.3
The enhanced R distribution from Microsoft
Microsoft packages Copyright (C) 2018 Microsoft

Loading Microsoft R Client packages, version 3.4.3.0097. Microsoft R Client limits some functions to available memory. See: https://go.microsoft.com/fwlink/?linkid=799476 for information about additional features.

Type 'readme()' for release notes, privacy() for privacy policy, or 'RevoLicense()' for licensing information.

Using the Intel MKL for parallel mathematical computing (using 36 cores). Default CRAN mirror snapshot taken on 2018-01-01. See: https://mran.microsoft.com/.

[Previously saved workspace restored]

library("reticulate", lib.loc="~/.localRlibrary")
use_python("~/anaconda3/bin/python")
reticulate::py_discover_config("h2o4gpu")
python: ~/anaconda3/bin/python
libpython: ~/anaconda3/lib/libpython3.6m.so
pythonhome: ~/anaconda3:/users/nvoronch/testing/anaconda3
version: 3.6.5 |Anaconda, Inc.| (default, Apr 29 2018, 16:14:56) [GCC 7.2.0]
numpy: ~/anaconda3/lib/python3.6/site-packages/numpy
numpy_version: 1.15.0
h2o4gpu: ~/anaconda3/lib/python3.6/site-packages/h2o4gpu

python versions found: ~/anaconda3/bin/python /usr/bin/python

library("h2o4gpu", lib.loc="~/.localRlibrary")

Attaching package: ‘h2o4gpu’

The following object is masked from ‘package:base’:

transform

x <- iris[1:4]
y <- as.integer(iris$Species) - 1
model <- h2o4gpu.random_forest_classifier() %>% fit(x, y)
~/anaconda3/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
~/anaconda3/lib/python3.6/site-packages/h2o4gpu/ensemble/weight_boosting.py:29: DeprecationWarning: numpy.core.umath_tests is an internal NumPy module and should not be imported. It will be removed in a future NumPy release.
  from numpy.core.umath_tests import inner1d
terminate called after throwing an instance of 'thrust::system::system_error'
what(): /root/repo/xgboost/src/common/host_device_vector.cu(82): invalid argument
Aborted
[ ~]$

nikolayvoronchikhin commented 6 years ago

@mdymczyk Who should I contact for this issue?

mdymczyk commented 6 years ago

@nikolayvoronchikhin I see in your logs that you are using Anaconda. As mentioned in our README, we do not (yet) support or test with conda. Libraries that Anaconda installs by default can cause a range of issues, which may be why h2o4gpu is unable to load the underlying C/C++ binaries.

@hemenkapadia is working on a Conda distribution https://github.com/h2oai/h2o4gpu/pull/671 and we should have it soon-ish.

nikolayvoronchikhin commented 6 years ago

@mdymczyk The Anaconda version is now working fine; however, in R I am now getting this error:

[~]$ R

R version 3.4.3 (2017-11-30) -- "Kite-Eating Tree" Copyright (C) 2017 The R Foundation for Statistical Computing Platform: x86_64-pc-linux-gnu (64-bit)

Microsoft R Open 3.4.3 The enhanced R distribution from Microsoft Microsoft packages Copyright (C) 2018 Microsoft

Loading Microsoft R Client packages, version 3.4.3.0097. ...

library("reticulate", lib.loc="~/.localRlibrary")
use_python("~/anaconda3/bin/python")
x <- iris[1:4]
y <- as.integer(iris$Species) - 1
reticulate::py_discover_config("h2o4gpu")
model <- h2o4gpu.random_forest_classifier() %>% fit(x, y)
Error in h2o4gpu.random_forest_classifier() %>% fit(x, y) :
could not find function "%>%"
pred <- model %>% predict(x)
Error in model %>% predict(x) : could not find function "%>%"

Same error occurs for all the examples here: https://cran.r-project.org/web/packages/h2o4gpu/vignettes/getting_started.html

hemenkapadia commented 6 years ago

Hi @nikolayvoronchikhin ,

We released conda packages for h2o4gpu last week. The README is updated with instructions on how to set up a conda environment with h2o4gpu downloaded from the h2oai channel on Anaconda Cloud.

To debug, I suggest the following steps:

  1. Create a conda environment with h2o4gpu.
  2. Then run a Jupyter Notebook in the conda environment you just created.
  3. Then try manually executing the h2o4gpu GLM notebook in the running kernel.

If all works as expected, then we have ensured that h2o4gpu is installed properly and working fine.

Next, we proceed to use reticulate with the conda environment you set up. See reticulate's use_condaenv function for connecting to the conda environment.

Let us know your experience with that.

Regards, Hemen

nikolayvoronchikhin commented 6 years ago

Hi @hemenkapadia ,

I tested the conda approach, and the h2o4gpu GLM notebook works fine in the JupyterHub Python 3 kernel and in the terminal.

I used the reticulate approach with the conda environment; it works fine in the R shell but fails in both RStudio and the JupyterHub IR kernel. What else do I need to do to make it work in RStudio and the JupyterHub IR kernel?

Thanks, Nikolay

hemenkapadia commented 6 years ago

Hi @nikolayvoronchikhin,

What is the failure message you get in RStudio?

Regards, Hemen

nikolayvoronchikhin commented 6 years ago

Same error occurs in RStudio & JupyterHub IR Kernel

library("reticulate", lib.loc="~/R/x86_64-pc-linux-gnu-library/3.4")
use_condaenv(condaenv = "h2o4gpuenv", conda = "~/anaconda3/bin/conda")
library("h2o4gpu", lib.loc="~/R/x86_64-pc-linux-gnu-library/3.4")
x <- iris[1:4]
y <- as.integer(iris$Species)

Attaching package: ‘h2o4gpu’

The following object is masked from ‘package:base’:

transform

model <- h2o4gpu.random_forest_classifier() %>% fit(x, y)
Error: Traceback:

  1. h2o4gpu.random_forest_classifier() %>% fit(x, y)
  2. eval(lhs, parent, parent)
  3. eval(lhs, parent, parent)
  4. h2o4gpu.random_forest_classifier()
  5. h2o4gpu$RandomForestClassifier
  6. $.python.builtin.module(h2o4gpu, RandomForestClassifier)
  7. py_resolve_module_proxy(x)
  8. on_error(result)
  9. stop(e$error_message, call. = FALSE)

nikolayvoronchikhin commented 6 years ago

Hi @hemenkapadia , What should I try next?

Thanks, Nikolay

hemenkapadia commented 6 years ago

Hi @nikolayvoronchikhin ,

Can you try with the preview of RStudio 1.2 (https://www.rstudio.com/products/rstudio/download/preview/)? I read an article saying that it has better support for reticulate.

Since the R shell seems to work fine, I believe this to be an issue in RStudio, and the better support for reticulate in 1.2 may resolve it for you.

Regards, Hemen

nikolayvoronchikhin commented 6 years ago

Hi @hemenkapadia ,

I installed RStudio Server 1.2.1060 and reinstalled reticulate 1.10 (latest version, from the tarball) and h2o4gpu 0.3.0.9999 (development version, from GitHub), but I am still getting the same error as above.

Fails in: RStudio, Jupyter IR Kernel
Works in: Jupyter Terminal (Shell), RStudio Terminal (Shell), Bash (Shell)

library("reticulate", lib.loc="~/R/x86_64-pc-linux-gnu-library/3.4")
use_condaenv(condaenv = "h2o4gpuenv", conda = "~/anaconda3/bin/conda")
library("h2o4gpu", lib.loc="~/R/x86_64-pc-linux-gnu-library/3.4")
x <- iris[1:4]
y <- as.integer(iris$Species)

Attaching package: ‘h2o4gpu’

The following object is masked from ‘package:base’:

transform

model <- h2o4gpu.random_forest_classifier() %>% fit(x, y)
Error: Traceback:

  1. h2o4gpu.random_forest_classifier() %>% fit(x, y)
  2. eval(lhs, parent, parent)
  3. eval(lhs, parent, parent)
  4. h2o4gpu.random_forest_classifier()
  5. h2o4gpu$RandomForestClassifier
  6. $.python.builtin.module(h2o4gpu, RandomForestClassifier)
  7. py_resolve_module_proxy(x)
  8. on_error(result)
  9. stop(e$error_message, call. = FALSE)

Thanks, Nikolay
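
Since the same code works in R started from a shell but fails in RStudio and the IR kernel, one thing worth comparing is the process environment of a working session and a failing one. The following is a minimal sketch, assuming each session's environment has been saved as text with one KEY=VALUE pair per line (however that dump is produced). The helper names parse_env and diff_env are hypothetical, and nothing in this thread confirms that an environment difference was the cause.

```python
def parse_env(text):
    """Parse KEY=VALUE lines into a dict, ignoring lines without '='."""
    env = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key:
            env[key] = value
    return env


def diff_env(working, failing,
             keys=("PATH", "LD_LIBRARY_PATH", "PYTHONHOME", "RETICULATE_PYTHON")):
    """Return {key: (working_value, failing_value)} for the keys that differ."""
    w, f = parse_env(working), parse_env(failing)
    return {k: (w.get(k), f.get(k)) for k in keys if w.get(k) != f.get(k)}


if __name__ == "__main__":
    # Illustrative values only, not taken from the sessions in this thread.
    shell_dump = "PATH=/opt/conda/bin:/usr/bin\nLD_LIBRARY_PATH=/opt/conda/lib\n"
    ide_dump = "PATH=/usr/bin\n"
    for key, (good, bad) in diff_env(shell_dump, ide_dump).items():
        print(key, "| working:", good, "| failing:", bad)
```

A variable such as LD_LIBRARY_PATH that is set in the shell session but missing in the other one would suggest that the failing front end does not read the same startup files as the shell.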

nikolayvoronchikhin commented 6 years ago

Hi @hemenkapadia

It is now working in RStudio Server.

Thanks, Nikolay