h2oai / h2o4gpu

H2Oai GPU Edition
Apache License 2.0
460 stars 95 forks

AttributeError: Coefficients are not defined for Booster type gbtree issue when calling h2o4gpu.random_forest_classifier() %>% fit(x, y) #765

Closed zojeda closed 5 years ago

zojeda commented 5 years ago

Description

After a fresh install of the R package, I get an error when running the example provided for R.

The problem seems related to the attribute attachment in the R package: it tries to access attributes that raise exceptions in the Python code (xgboost/sklearn.py).

Wrapping the assignment in a simple try fixes the issue, but the error message still appears; a proper exclusion of attributes, according to the booster used, should be done to prevent this error.
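The fix described above can be sketched in Python (the class and helper names here are illustrative, not the actual h2o4gpu wrapper code): copy attributes one by one and skip any property that raises, rather than letting a single raising property abort the whole attachment.

```python
class FakeModel:
    """Stand-in for an XGBModel whose coef_ property raises for tree boosters."""
    booster = "gbtree"

    @property
    def coef_(self):
        raise AttributeError(
            "Coefficients are not defined for Booster type %s" % self.booster)


def safe_attributes(model):
    """Collect public attributes, skipping properties that raise AttributeError."""
    attrs = {}
    for name in dir(model):
        if name.startswith("_"):
            continue
        try:
            attrs[name] = getattr(model, name)
        except AttributeError:
            # e.g. coef_ with booster='gbtree' -- skip instead of failing
            pass
    return attrs


attrs = safe_attributes(FakeModel())
print("coef_" in attrs)    # False: the raising property was skipped
print("booster" in attrs)  # True
```

A stricter fix, as noted above, would be to exclude coef_ (and similar booster-dependent properties) up front based on the booster type, rather than swallowing the exception.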

Frankie-Figz commented 5 years ago

I confirm that I am also receiving this error message. I had to revert back to version 0.3.0 for CUDA 9 to get it working.

savvyyabby commented 5 years ago

I have this too ... Ubuntu 18.04 with the latest CUDA 10 wheel installed. I tried the h2o4gpu-cuda92 conda install; that works with R without this bug but fails with Jupyter notebook. The problem in that case is that the Python 3.6 kernel cannot connect to the notebook.

Result: R does not work with the CUDA 10 wheel installed via pip, but Jupyter notebook does; and R does work with h2o4gpu-cuda92, but Jupyter notebook does not.

savvyyabby commented 5 years ago

Update ... I found I could get the h2o4gpu-cuda92 install to work with conda, provided I downgraded tornado as per the Docker instructions:

conda install tornado==4.5.3

Just confirmed that the above install does work with R and the rstudio example:

> library(h2o4gpu)

Attaching package: ‘h2o4gpu’

The following object is masked from ‘package:base’:

    transform

> use_condaenv("h2o4gpu92")
> x <- iris[1:4]
> y <- as.integer(iris$Species) - 1
> model <- h2o4gpu.random_forest_classifier() %>% fit(x, y)
> predictions <- model %>% predict(x)
>

When I reload the R session to use a CUDA 10 version of h2o4gpu from this wheel

h2o4gpu-0.3.2-cp36-cp36m-linux_x86_64.whl

I find:

> library(h2o4gpu)

Attaching package: ‘h2o4gpu’

The following object is masked from ‘package:base’:

    transform

> use_condaenv("h2o4gpuenv")
> x <- iris[1:4]
> y <- as.integer(iris$Species) - 1
> model <- h2o4gpu.random_forest_classifier() %>% fit(x, y)
Error in py_get_attr_impl(x, name, silent) : 
  AttributeError: Coefficients are not defined for Booster type gbtree
> predictions <- model %>% predict(x)
Error in eval(lhs, parent, parent) : object 'model' not found
> 


savvyyabby commented 5 years ago

This seems to be due to a change in xgboost's sklearn.py module.

The install with conda install -c h2oai h2o4gpu-cuda92 uses version 0.80 of xgboost.

The latest wheel for h2o4gpu uses the 0.83.dev0 version of xgboost.

In the later version of xgboost there is a new XGBModel property coef_ (at line 562 in sklearn.py), which seems to be the source of the error when running under R.

    @property
    def coef_(self):
        """
        Coefficients property

        .. note:: Coefficients are defined only for linear learners

            Coefficients are only defined when the linear model is chosen as base
            learner (`booster=gblinear`). It is not defined for other base learner types, such
            as tree learners (`booster=gbtree`).

        Returns
        -------
        coef_ : array of shape ``[n_features]`` or ``[n_classes, n_features]``
        """
        if getattr(self, 'booster', None) is not None and self.booster != 'gblinear':
            raise AttributeError('Coefficients are not defined for Booster type {}'
                                 .format(self.booster))
        b = self.get_booster()
        coef = np.array(json.loads(b.get_dump(dump_format='json')[0])['weight'])
        # Logic for multiclass classification
        n_classes = getattr(self, 'n_classes_', None)
        if n_classes is not None:
            if n_classes > 2:
                assert len(coef.shape) == 1
                assert coef.shape[0] % n_classes == 0
                coef = coef.reshape((n_classes, -1))
        return coef

This is not present in the earlier 0.80 version of xgboost.
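For illustration, here is a minimal self-contained sketch (not the real xgboost classes) that mimics the behaviour of the property quoted above: any attempt to read coef_ on a tree booster raises AttributeError, which is exactly what the R wrapper trips over when it copies model attributes.

```python
class BoosterSketch:
    """Toy stand-in for XGBModel; only the coef_ gating logic is reproduced."""

    def __init__(self, booster):
        self.booster = booster

    @property
    def coef_(self):
        # Mirrors the 0.83.dev0 check: only gblinear defines coefficients.
        if self.booster != 'gblinear':
            raise AttributeError(
                'Coefficients are not defined for Booster type {}'.format(self.booster))
        return [0.1, 0.2]  # placeholder weights


try:
    BoosterSketch('gbtree').coef_
except AttributeError as e:
    print(e)  # Coefficients are not defined for Booster type gbtree

print(BoosterSketch('gblinear').coef_)  # [0.1, 0.2]
```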

Will see if I can just downgrade xgboost to version 0.80.

savvyyabby commented 5 years ago

Here is a workaround... use xgboost version 0.80:

pip install xgboost==0.80

That will work with one GPU in RStudio, but not with multi-GPU.

> library(h2o4gpu)
> #library(xgboost)
> #library(reticulate)
> use_condaenv("h2o4gpuenv")
> #use_condaenv("h2o4gpuenv92")
> # Setup dataset
> x <- iris[1:4]
> y <- as.integer(iris$Species) - 1
> 
> # Initialize and train the classifier
> model <- h2o4gpu.random_forest_classifier(n_gpus=1) %>% fit(x, y)
> # Make predictions
> predictions <- model %>% predict(x)
> 
> # Compute classification error using the Metrics package (note this is training error)
> library(Metrics)
> ce(actual = y, predicted = predictions)
[1] 0.02666667

It seems like the issue is due to changes in xgboost between versions 0.80 and 0.83.dev0:

https://github.com/dmlc/xgboost/issues/3586

savvyyabby commented 5 years ago

I played with this some more... with CUDA 10 and NCCL 2 I could build xgboost with multi-GPU support for version numbers >= 0.81. However, these have the above breaking change in sklearn.py so that the R examples don't work.

When I tried to build xgboost version 0.80 with CUDA 10 and NCCL the build failed with some compiler errors. I then did a fresh build of xgboost version 0.90 with multi-gpu support.

That seems to work with the notebook examples okay but fails with the R package in the same way.

Incidentally, I tried building the whole h2o4gpu package from source, but it kept failing. What worked was simply to pip install the 0.3.2 version of h2o4gpu and then manually remove xgboost and replace it with the 0.90 multi-GPU build. That worked, except for the R problem.

Here is my final config:

(h2o4gpuenv) grendel@beowulf:~$ pip freeze
apipkg==1.5
astroid==1.6.6
atomicwrites==1.3.0
attrs==19.1.0
backcall==0.1.0
bleach==3.1.0
certifi==2019.6.16
chardet==3.0.4
colorama==0.4.1
coverage==4.5.3
cupy-cuda100==6.1.0
cycler==0.10.0
daal==2019.0
decorator==4.4.0
defusedxml==0.6.0
entrypoints==0.3
execnet==1.6.0
fastrlock==0.4
feather-format==0.4.0
future==0.16.0
h2o==3.18.0.11
h2o4gpu==0.3.2
icc-rt==2019.0
idna==2.8
importlib-metadata==0.18
intel-openmp==2019.0
ipykernel==4.8.2
ipython==6.3.1
ipython-genutils==0.2.0
ipywidgets==6.0.0
isort==4.3.21
jedi==0.14.0
Jinja2==2.10.1
joblib==0.13.2
jsonschema==3.0.1
jupyter==1.0.0
jupyter-client==5.3.1
jupyter-console==5.1.0
jupyter-core==4.4.0
lazy-object-proxy==1.4.1
MarkupSafe==1.1.1
matplotlib==2.0.2
mccabe==0.6.1
mistune==0.8.4
mkl==2019.0
mkl-fft==1.0.6
mkl-random==1.0.1.1
more-itertools==7.1.0
nbconvert==5.5.0
nbformat==4.4.0
notebook==5.7.8
numpy==1.16.4
olefile==0.46
packaging==19.0
pandas==0.24.2
pandocfilters==1.4.2
parso==0.5.0
pexpect==4.7.0
pickleshare==0.7.5
Pillow==4.2.1
pluggy==0.12.0
prometheus-client==0.7.1
prompt-toolkit==1.0.16
psutil==5.4.5
ptyprocess==0.6.0
py==1.8.0
py3nvml==0.2.3
pyarrow==0.13.0
pydaal==2019.0.0.20180713
Pygments==2.4.2
pylint==1.8.4
pymapd==0.12.1
pyparsing==2.4.0
pyrsistent==0.15.3
pytest==3.10.1
pytest-cov==2.5.1
pytest-forked==0.2
pytest-timeout==1.3.3
pytest-xdist==1.22.2
python-dateutil==2.7.2
pytz==2018.4
pyzmq==18.0.2
qtconsole==4.5.1
requests==2.22.0
scikit-learn==0.20.2
scipy==1.2.1
seaborn==0.8.1
Send2Trash==1.5.0
simplegeneric==0.8.1
six==1.12.0
SQLAlchemy==1.3.5
tabulate==0.8.2
tbb==2019.0
tbb4py==2019.0
terminado==0.8.2
testpath==0.4.2
thrift==0.11.0
tornado==6.0.3
traitlets==4.3.2
urllib3==1.25.3
wcwidth==0.1.7
webencodings==0.5.1
widgetsnbextension==2.0.1
wrapt==1.11.2
xgboost==0.90
xmltodict==0.12.0
zipp==0.5.2

Giving up on R for now. I may just use xgboost directly from Python.

sh1ng commented 5 years ago

@savvyyabby thank you for your analysis. The bundled xgboost version is compiled with NCCL 2, and since all Python tests pass, the problem is in the R wrapper. I'm working on fixing it.

savvyyabby commented 5 years ago

@sh1ng Thanks for that!

zojeda commented 5 years ago

I did a little hack to the R code wrapping the model:

https://github.com/zojeda/h2o4gpu/commit/448a04ffb277eb16f2c87dc615bca5a1a18a9e4c

You can try it with: devtools::install_github("zojeda/h2o4gpu", subdir = "src/interface_r")

But a real solution would be to properly access the Python model attributes.
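One hedged sketch of what "properly accessing" such attributes could look like on the Python side (illustrative class name, not real h2o4gpu code): the three-argument form of getattr returns the default even when a property getter raises AttributeError, so a raising property like coef_ can be skipped cleanly without a try block at the call site.

```python
class TreeModel:
    """Toy stand-in for a fitted tree-booster model."""
    booster = 'gbtree'

    @property
    def coef_(self):
        raise AttributeError('Coefficients are not defined for Booster type gbtree')


# getattr's default argument absorbs the AttributeError raised inside the
# property getter, so no exception propagates to the caller.
coef = getattr(TreeModel(), 'coef_', None)
print(coef)  # None
```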

sh1ng commented 5 years ago

I fixed it in #785; let's wait for CI.