rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0

After install in WSL2, can import cudf but cudf.pandas not found #14410

Closed eafpres closed 10 months ago

eafpres commented 10 months ago

Describe the bug

I followed the install instructions for WSL2 and the install completed without errors, but importing cudf.pandas in Python fails:

eafpres@EAFLLCML:~$ python
Python 3.9.18 (main, Aug 25 2023, 13:20:04)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cudf.pandas
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'cudf.pandas'

Note that cudf itself imports and works:

eafpres@EAFLLCML:~$ python
Python 3.9.18 (main, Aug 25 2023, 13:20:04)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cudf
>>> print(cudf.Series([1, 2, 3]))
0    1
1    2
2    3
dtype: int64

Steps/Code to reproduce bug

I followed these instructions using an existing GPU-enabled WSL2 instance: https://docs.rapids.ai/install#wsl2-pip

Expected behavior

I expected cudf.pandas to be importable in Python.

Environment details

The template asks for the output of the cudf/print_env.sh script. Where would I find this file in WSL2?

bdice commented 10 months ago

Hi @eafpres, please see the docs on how to use cudf.pandas. https://docs.rapids.ai/api/cudf/stable/cudf_pandas/

Rather than writing import cudf.pandas, you need to write %load_ext cudf.pandas in a notebook, or pass python -m cudf.pandas my_script.py on the command line. Then just use import pandas as pd, and the pandas module will be accelerated automatically.

If you are still having trouble, please verify that your cudf version is 23.10.1 or newer.
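
For anyone who wants to check this from Python, here is a minimal sketch of the version comparison. The helper names `parse_release` and `meets_minimum` are made up for illustration, and the distribution names `cudf-cu11` / `cudf-cu12` are the pip wheel names that appear later in this thread; adjust them if your install uses a different name.

```python
import re
from importlib import metadata

# Minimum version named in this thread for cudf.pandas.
MINIMUM = (23, 10, 1)


def parse_release(version):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(version, minimum=MINIMUM):
    """True if `version` is at least `minimum` (compared component-wise)."""
    return parse_release(version) >= minimum


if __name__ == "__main__":
    # Assumed wheel names; packages that are not installed are skipped.
    for name in ("cudf-cu11", "cudf-cu12"):
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
        status = "ok" if meets_minimum(version) else "too old for cudf.pandas"
        print(f"{name} {version}: {status}")
```

The parser only looks at the leading numeric part, so a nightly-style suffix after the numbers is ignored rather than compared.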

eafpres commented 10 months ago

> Rather than writing import cudf.pandas, you need to write %load_ext cudf.pandas in a notebook, or pass python -m cudf.pandas my_script.py on the command line. Then just use import pandas as pd, and the pandas module will be accelerated automatically.

This page (the NVIDIA RAPIDS page) says you can import in a script: https://rapids.ai/cudf-pandas/

"Or, explicitly enable cudf.pandas via import if you can't use command line flags:

import cudf.pandas
cudf.pandas.install()

import pandas as pd
..."

> If you are still having trouble, please verify that your cudf version is 23.10.1 or newer.

I just did the pip install today, but the version is 23.06.01. Here is the install command:

pip install \
    --extra-index-url=https://pypi.nvidia.com \
    cudf-cu11

bdice commented 10 months ago

Yes, you're correct about that being a third option for enabling cudf.pandas (it requires more changes to the code so I typically don't recommend it first). However, you'll need a newer version of cudf (at least 23.10.1, or a nightly version of 23.12) to make this work. You might need to update your version of pip -- we have seen a couple users report issues with older pip versions. If that doesn't help, can you share the output of pip list?
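
As a rough sketch of the import-based option with a guard for older installs: the wrapper name `enable_cudf_pandas` is hypothetical, while `cudf.pandas.install()` is the call quoted from the RAPIDS page above. It assumes the guard runs before pandas is first imported, which is how the quoted snippet orders things.

```python
def enable_cudf_pandas():
    """Try to enable cudf.pandas; return True on success, False if unavailable."""
    try:
        # Raises ModuleNotFoundError (a subclass of ImportError) when cudf is
        # missing or is a release that does not ship the cudf.pandas module.
        import cudf.pandas
    except ImportError:
        return False
    cudf.pandas.install()
    return True


# Call this before the first `import pandas` in the script.
accelerated = enable_cudf_pandas()
print("cudf.pandas enabled" if accelerated else "falling back to plain pandas")
```

With this in place, a script still runs on an environment like the one reported here (cudf 23.6.1), just without acceleration.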

eafpres commented 10 months ago

> Yes, you're correct about that being a third option for enabling cudf.pandas (it requires more changes to the code so I typically don't recommend it first).

Actually this is recommended on the main page as a no-code change option.

> However, you'll need a newer version of cudf (at least 23.10.1, or a nightly version of 23.12) to make this work. You might need to update your version of pip -- we have seen a couple users report issues with older pip versions. If that doesn't help, can you share the output of pip list?

I updated pip from 23..2 to 23.3.1, but when I re-ran the install, the version of cudf stays the same, and there is no cudf.pandas

/usr/bin/python: No module named cudf.pandas

eafpres@EAFLLCML:~$ pip list
Package Version


absl-py 1.4.0 aiohttp 3.8.4 aiohttp-cors 0.7.0 aiorwlock 1.3.0 aiosignal 1.3.1 alabaster 0.7.13 alembic 1.11.1 ansi2html 1.8.0 anyio 3.6.2 appdirs 1.4.4 argon2-cffi 21.3.0 argon2-cffi-bindings 21.2.0 array-record 0.2.0 arrow 1.2.3 asgiref 3.7.2 astor 0.8.1 astroid 2.15.5 asttokens 2.2.1 astunparse 1.6.3 async-timeout 4.0.2 atom-ml 5.2.0 atomicwrites 1.4.1 attrs 23.1.0 autograd 1.5 autograd-gamma 0.5.0 Automat 0.8.0 autopep8 2.0.2 Babel 2.12.1 backcall 0.2.0 beautifulsoup4 4.12.2 bidict 0.22.1 binaryornot 0.4.4 black 23.3.0 bleach 6.0.0 blessed 1.20.0 blinker 1.4 blis 0.7.9 boto3 1.28.29 botocore 1.31.29 Brotli 1.0.9 cachetools 5.3.0 catalogue 2.0.8 category-encoders 2.6.1 certifi 2020.4.5.2 cffi 1.15.1 chardet 3.0.4 charset-normalizer 3.1.0 cheap-repr 0.5.1 chrome-gnome-shell 0.0.0 cleanlab 2.4.0 click 8.1.3 cloud-init 23.2.2 cloudpickle 2.2.1 cmaes 0.9.1 colorama 0.4.3 colorful 0.5.5 colorlog 6.7.0 comm 0.1.3 command-not-found 0.3 commonmark 0.9.1 confection 0.0.4 configobj 5.0.6 constantly 15.1.0 contourpy 1.0.7 cookiecutter 2.1.1 coverage 7.2.7 cryptography 2.8 cubinlinker-cu11 0.3.0.post1 cuda-python 11.8.2 cudf-cu11 23.6.1 cudf-cu12 23.6.1 cuml-cu11 23.6.0 cuml-cu12 23.6.0 cupshelpers 1.0 cupy-cuda11x 12.2.0 cycler 0.11.0 cymem 2.0.7 Cython 0.29.33 daal 2023.2.0 daal4py 2023.2.0 dagshub 0.2.10 dash 2.11.1 dash-bootstrap-components 1.4.1 dash-core-components 2.0.0 dash-daq 0.5.0 dash-html-components 2.0.0 dash-renderer 1.8.3 dash-table 5.0.0 dask 2023.3.2 dask-cuda 23.6.0 dask-cudf-cu11 23.6.0 dask-cudf-cu12 23.6.0 databricks-cli 0.17.7 dbus-python 1.2.16 debugpy 1.6.7 decaf-synthetic-data 0.1.6 decorator 4.4.2 defer 1.0.6 defusedxml 0.7.1 diff-match-patch 20230430 dill 0.3.6 distlib 0.3.7 distributed 2023.3.2.1 distro 1.4.0 dm-tree 0.1.8 docker 6.1.3 docstring-to-markdown 0.12 docutils 0.20.1 entrypoints 0.3 et-xmlfile 1.1.0 etils 1.3.0 exceptiongroup 1.1.1 executing 1.2.0 fastai 2.7.12 fastapi 0.100.0 fastcore 1.5.29 fastdownload 0.0.7 fastjsonschema 2.16.3 
fastprogress 1.0.3 fastrlock 0.8.1 fasttreeshap 0.1.6 feather-format 0.4.1 featuretools 1.26.0 fflows 0.0.3 filelock 3.12.2 flake8 6.0.0 Flask 2.2.5 Flask-Cors 4.0.0 flatbuffers 23.5.9 fonttools 4.39.4 formulaic 0.6.1 frozenlist 1.3.3 fsspec 2023.6.0 fusepy 3.0.1 future 0.18.3 gast 0.4.0 geomloss 0.2.6 git-python 1.0.3 gitdb 4.0.10 GitPython 3.1.32 google-api-core 2.11.1 google-auth 2.18.0 google-auth-oauthlib 1.0.0 google-cloud-core 2.3.3 google-cloud-storage 2.10.0 google-crc32c 1.5.0 google-pasta 0.2.0 google-resumable-media 2.5.0 googleapis-common-protos 1.59.0 gplearn 0.4.2 gpustat 1.1 greenlet 2.0.2 grpcio 1.51.3 gunicorn 20.1.0 h11 0.12.0 h5py 3.8.0 holidays 0.29 httpcore 0.14.7 httplib2 0.14.0 httptools 0.6.0 httpx 0.22.0 hyperlink 19.0.0 idna 2.9 imagesize 1.4.1 imbalanced-learn 0.10.1 importlib-metadata 6.6.0 importlib-resources 5.12.0 incremental 16.10.1 inflate64 0.3.1 inflection 0.5.1 iniconfig 2.0.0 intel-openmp 2023.2.0 interface-meta 1.3.0 intervaltree 3.1.0 ipykernel 6.23.1 ipython 8.13.2 ipython_genutils 0.2.0 ipywidgets 8.0.6 isort 5.12.0 itsdangerous 2.1.2 jax 0.4.10 jedi 0.18.2 jeepney 0.8.0 jellyfish 0.11.2 Jinja2 3.1.2 jinja2-time 0.2.0 jmespath 1.0.1 joblib 1.2.0 jsonpatch 1.22 jsonpointer 2.0 jsonschema 3.2.0 jupyter 1.0.0 jupyter_client 8.2.0 jupyter-console 6.6.3 jupyter_core 5.3.0 jupyter-events 0.6.3 jupyter_server 2.5.0 jupyter_server_terminals 0.4.4 jupyterlab-pygments 0.2.2 jupyterlab-widgets 3.0.7 keopscore 2.1.2 keras 2.12.0 keras-cv 0.5.0 keyring 21.2.1 kiwisolver 1.4.4 kubernetes 27.2.0 langcodes 3.3.0 language-selector 0.1 launchpadlib 1.10.13 lazr.restfulclient 0.14.2 lazr.uri 1.0.3 lazy-object-proxy 1.9.0 libclang 16.0.0 lifelines 0.27.7 lightning-utilities 0.8.0 llvmlite 0.40.1rc1 locket 1.0.0 loguru 0.7.0 lowess 1.0.3 macaroonbakery 1.3.1 Mako 1.2.4 Markdown 3.4.3 MarkupSafe 2.1.2 matplotlib 3.7.1 matplotlib-inline 0.1.6 mccabe 0.7.0 mistune 2.0.5 mkl 2023.2.0 mkl-service 2.4.0 ml-dtypes 0.1.0 mlflow 2.5.0 modin 0.22.3 monai 
1.2.0 more-itertools 4.2.0 msgpack 1.0.5 multidict 6.0.4 multivolumefile 0.2.3 murmurhash 1.0.9 mypy-extensions 1.0.0 natsort 8.4.0 nbclassic 1.0.0 nbclient 0.7.4 nbconvert 7.4.0 nbformat 5.8.0 nest-asyncio 1.5.6 netifaces 0.10.4 networkx 2.8.8 nflows 0.14 nltk 3.8.1 notebook 6.5.4 notebook_shim 0.2.3 numba 0.57.0 numpy 1.23.5 numpydoc 1.5.0 nvidia-cublas-cu11 11.10.3.66 nvidia-cuda-nvrtc-cu11 11.7.99 nvidia-cuda-runtime-cu11 11.7.99 nvidia-cudnn-cu11 8.5.0.96 nvidia-ml-py 11.525.112 nvitop 1.1.2 nvtx 0.2.6 oauthlib 3.2.2 opacus 1.4.0 opencensus 0.11.2 opencensus-context 0.1.3 opencv-python 4.8.0.76 openpyxl 3.1.2 opt-einsum 3.3.0 optuna 3.2.0 outcome 1.2.0 packaging 20.4 pandas 1.5.3 pandocfilters 1.5.0 parso 0.8.3 partd 1.4.0 pathspec 0.11.1 pathy 0.10.2 patsy 0.5.3 pexpect 4.6.0 pgmpy 0.1.22 pickleshare 0.7.5 Pillow 9.5.0 pip 23.3.1 pkginfo 1.5.0.1 platformdirs 3.5.1 plotext 5.2.8 plotly 5.15.0 pluggy 1.0.0 pprintpp 0.4.0 preshed 3.0.8 prometheus-client 0.16.0 promise 2.3 prompt-toolkit 3.0.38 protobuf 4.21.12 psutil 5.9.5 psycopg2-binary 2.9.6 ptxcompiler-cu11 0.7.0.post1 ptyprocess 0.7.0 pure-eval 0.2.2 py-spy 0.3.14 py7zr 0.20.5 pyarrow 11.0.0 pyasn1 0.4.2 pyasn1-modules 0.2.1 pybcj 1.0.1 pybind11 2.10.4 pycairo 1.16.2 pycodestyle 2.10.0 pycox 0.2.3 pycparser 2.21 pycryptodomex 3.18.0 pycups 1.9.73 pydantic 1.10.9 pydocstyle 6.3.0 pyflakes 3.0.1 Pygments 2.15.1 PyGObject 3.36.0 PyHamcrest 1.9.0 PyJWT 1.7.1 pykeops 2.1.2 pylibraft-cu11 23.6.2 pylibraft-cu12 23.6.2 pylint 2.17.4 pylint-venv 3.0.1 pyls-spyder 0.4.0 pymacaroons 0.13.0 pymatrix-rain 1.2.0 PyNaCl 1.3.0 pynndescent 0.5.10 pynvml 11.4.1 pyOpenSSL 19.0.0 pyparsing 3.0.9 pyppmd 1.0.0 PyQt5 5.15.9 PyQt5-Qt5 5.15.2 PyQt5-sip 12.12.1 PyQt6 6.5.0 PyQt6-Qt6 6.5.0 PyQt6-sip 13.5.1 PyQtWebEngine 5.14.0 pyRFC3339 1.1 pyrsistent 0.15.5 pyserial 3.4 PySide2 5.15.2.1 PySide6 6.5.0 PySide6-Addons 6.5.0 PySide6-Essentials 6.5.0 PySocks 1.7.1 pytest 7.3.2 pytest-cov 4.1.0 python-apt 2.0.1+ubuntu0.20.4.1 
python-dateutil 2.8.2 python-dotenv 1.0.0 python-engineio 4.5.1 python-json-logger 2.0.7 python-lsp-black 1.2.1 python-lsp-jsonrpc 1.0.0 python-lsp-server 1.7.3 python-magic 0.4.27 python-slugify 8.0.1 python-socketio 5.8.0 pytoolconfig 1.2.5 pytorch-lightning 1.9.5 pyts 0.13.0 pytz 2023.3 pyxdg 0.28 PyYAML 6.0 pyzmq 25.0.2 pyzstd 0.15.7 QDarkStyle 3.1 qstylizer 0.2.2 QtAwesome 1.2.3 qtconsole 5.4.3 QtPy 2.3.1 querystring-parser 1.2.4 raft-dask-cu11 23.6.2 raft-dask-cu12 23.6.2 ray 2.5.1 readme-renderer 26.0 redis 4.5.5 regex 2023.5.5 requests 2.31.0 requests-oauthlib 1.3.1 requests-unixsocket 0.2.0 retrying 1.3.4 rfc3339-validator 0.1.4 rfc3986 1.5.0 rfc3986-validator 0.1.1 rich 10.2.2 rmm-cu11 23.6.0 rmm-cu12 23.6.0 rope 1.8.0 rsa 4.9 Rtree 1.0.1 s3transfer 0.6.2 scikit-learn 1.2.2 scikit-learn-intelex 2023.2.0 scipy 1.10.1 screen-resolution-extra 0.0.0 screeninfo 0.8.1 SecretStorage 3.3.3 selenium 4.10.0 sematic 0.34.0 semver 2.13.0 Send2Trash 1.8.2 service-identity 18.1.0 setuptools 67.7.2 shap 0.41.0 shapash 2.3.4 shapely 2.0.1 shiboken2 5.15.2.1 shiboken6 6.5.0 simplejson 3.16.0 sip 4.19.21 six 1.15.0 slicer 0.0.7 smart-open 6.3.0 smmap 5.0.0 sniffio 1.3.0 snoop 0.4.3 snowballstemmer 2.2.0 sortedcontainers 2.4.0 sos 4.5.6 soupsieve 2.4.1 spacy 3.5.3 spacy-legacy 3.0.12 spacy-loggers 1.0.4 spb-cli 0.18.0 Sphinx 7.0.1 sphinxcontrib-applehelp 1.0.4 sphinxcontrib-devhelp 1.0.2 sphinxcontrib-htmlhelp 2.0.1 sphinxcontrib-jsmath 1.0.1 sphinxcontrib-qthelp 1.0.3 sphinxcontrib-serializinghtml 1.1.5 spyder 5.4.3 spyder-kernels 2.4.3 SQLAlchemy 1.4.49 SQLAlchemy-Utils 0.41.1 sqlparse 0.4.4 srsly 2.4.6 ssh-import-id 5.10 stack-data 0.6.2 starlette 0.27.0 statsmodels 0.14.0 synthcity 0.2.6 systemd-python 234 tabulate 0.9.0 tbb 2021.10.0 tblib 2.0.0 tenacity 8.2.2 tensorboard 2.12.3 tensorboard-data-server 0.7.0 tensorflow 2.12.0 tensorflow-addons 0.20.0 tensorflow-datasets 4.9.2 tensorflow-estimator 2.12.0 tensorflow-io-gcs-filesystem 0.32.0 tensorflow-metadata 1.13.1 
termcolor 2.3.0 terminado 0.17.1 text-unidecode 1.3 textdistance 4.5.0 texttable 1.6.7 thinc 8.1.10 threadpoolctl 3.1.0 three-merge 0.1.1 tinycss2 1.2.1 toml 0.10.2 tomli 2.0.1 tomlkit 0.11.8 toolz 0.12.0 torch 1.13.1 torchmetrics 0.11.4 torchtext 0.14.1 torchtuples 0.2.2 torchvision 0.14.1 tornado 6.3.2 tqdm 4.66.1 traitlets 5.9.0 treelite 3.2.0 treelite-runtime 3.2.0 trio 0.22.2 trio-websocket 0.10.3 tsai 0.3.6 Twisted 18.9.0 typeguard 2.13.3 typer 0.7.0 typing_extensions 4.8.0 tzdata 2023.3 ubuntu-advantage-tools 8001 ucx-py-cu11 0.32.0 ucx-py-cu12 0.32.0 ufw 0.36 ujson 5.7.0 umap-learn 0.5.3 urllib3 1.26.16 uvicorn 0.23.1 uvloop 0.17.0 virtualenv 20.21.0 wadllib 1.3.3 wasabi 1.1.2 watchdog 3.0.0 watchfiles 0.19.0 wcwidth 0.2.6 webencodings 0.5.1 websocket-client 1.5.1 websockets 11.0.3 Werkzeug 2.2.3 whatthepatch 1.0.5 wheel 0.34.2 widgetsnbextension 4.0.7 woodwork 0.25.1 wrapt 1.14.1 wsproto 1.2.0 wurlitzer 3.0.3 xgboost 1.7.5 xgbse 0.2.3 xicor 1.0.1 xkit 0.0.0 yapf 0.33.0 yarl 1.9.2 zict 3.0.0 zipp 3.15.0 zoofs 0.1.26 zope.interface 4.7.1

shwina commented 10 months ago

Thanks for your patience, and apologies for the trouble @eafpres.

It's hard to tell exactly what is preventing the latest cudf from being picked up, but I suspect a dependency pinning issue: for example, one of the packages in your environment may pin a dependency to a version that is incompatible with cudf 23.10.1.

Could you please try one more set of commands and post the output of the second pip install ... command? That should tell us more:

pip uninstall cudf-cu11 cudf-cu12 cuml-cu11 cuml-cu12
pip install cudf-cu11==23.10.1 --extra-index-url=https://pypi.nvidia.com
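
The pip list output above shows both the cu11 and cu12 builds of several RAPIDS packages installed side by side, and the uninstall command removes both builds of cudf and cuml. Whether that mix contributed to the problem is not established in this thread, but a small sketch like the following can list such pairs. The function names and the `-cuNN` suffix pattern are assumptions for illustration, not part of any RAPIDS tooling.

```python
import re
from collections import defaultdict
from importlib import metadata

# Assumed naming pattern: "<base>-cu<digits>", e.g. "cudf-cu11".
_VARIANT = re.compile(r"^(?P<base>.+)-(?P<cuda>cu\d+)$")


def group_variants(names):
    """Map base name -> sorted CUDA suffixes, keeping only bases with more than one."""
    variants = defaultdict(set)
    for name in names:
        match = _VARIANT.match(name.lower())
        if match:
            variants[match["base"]].add(match["cuda"])
    return {base: sorted(cudas) for base, cudas in variants.items() if len(cudas) > 1}


def installed_names():
    """Names of all distributions visible to the current interpreter."""
    names = (dist.metadata["Name"] for dist in metadata.distributions())
    return [name for name in names if name]


if __name__ == "__main__":
    for base, cudas in sorted(group_variants(installed_names()).items()):
        print(f"{base}: installed for {', '.join(cudas)}")
```

For example, `group_variants(["cudf-cu11", "cudf-cu12", "rmm-cu11", "numpy"])` returns `{"cudf": ["cu11", "cu12"]}`.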

eafpres commented 10 months ago

> Could you please try one more set of commands and post the output of the second pip install ... command? That should tell us more

That fixed it. Originally I had a bunch of dependency errors, so I went back and redid some of the CUDA setup, then got cudf to install, but was stuck here. Thanks for the help!

eafpres@EAFLLCML:~$ python
Python 3.9.18 (main, Aug 25 2023, 13:20:04)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cudf.pandas
>>> cudf.pandas.install()
>>> import pandas as pd
>>>