Open tym1062 opened 5 years ago
hi @tym1062, take a look at the issue #214
@tym1062 @vilmara this has been escalated and we may have a fix by end of week with nightlies. Thanks so much!!
Currently xgboost doesn't support consuming boolean values from cudf, because cudf specializes them to a bitset, following Arrow. The workaround is to convert boolean columns to integer or float first. For those who are curious why we can support all data types in pandas: that's because we convert them to float in Python. Since GPU memory is quite precious, I decided not to do that conversion, to avoid implicit memory consumption.
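A minimal sketch of that workaround, not the notebook's actual fix. It uses pandas so it runs without a GPU, on the assumption that a cudf DataFrame accepts the same `dtypes` and `astype` calls. The helper name `cast_bool_columns` and the column names are hypothetical.

```python
import pandas as pd  # stand-in for cudf; assumes cudf.DataFrame accepts the same calls


def cast_bool_columns(df, dtype="float32"):
    """Return df with every boolean column cast to `dtype` (hypothetical helper)."""
    # Collect the names of columns whose dtype is boolean.
    bool_cols = [name for name, dt in df.dtypes.items() if dt == "bool"]
    if not bool_cols:
        return df
    # Cast only those columns; all other columns keep their dtype.
    return df.astype({name: dtype for name in bool_cols})


# Toy frame with made-up column names, not the NYC taxi schema.
frame = pd.DataFrame({"fare": [5.5, 12.0], "is_weekend": [True, False]})
converted = cast_bool_columns(frame)
print(converted.dtypes)
```

The cast would go right before the DMatrix is built, so the extra memory is spent explicitly and only on the boolean columns.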
Describe the bug The following stack trace was reported after executing cell 14 in the NYCTaxi-E2E notebook while using the RAPIDS 0.10 container:
Steps/Code to reproduce bug Download https://github.com/rapidsai/notebooks-contrib/blob/master/intermediate_notebooks/E2E/taxi/NYCTaxi-E2E.ipynb to /raid
docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 -v /raid:/rapids/data rapidsai/rapidsai:cuda10.1-runtime-ubuntu18.04
Once inside the container, start Jupyter Lab:
cd /rapids; ./utils/start_jupyter.sh
Open a browser and go to IP-Address:8888, then open the NYCTaxi-E2E notebook. Execute the cells in the notebook; after cell 14 is executed, you should see the traceback.
Expected behavior I suspect an extra step will need to be added to the NYCTaxi notebook to update the DMatrix before the XGBoost train method is called.
Environment details (please complete the following information):
Here is the conda package list for rapids packages from inside the RAPIDS 0.10 container:
docker pull & docker run commands used
docker pull rapidsai/rapidsai:cuda10.1-runtime-ubuntu18.04
docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 \
    rapidsai/rapidsai:cuda10.1-runtime-ubuntu18.04