rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0

[QST] circular import problem at the beginning of my code #15989

Open blue-cat-whale opened 3 weeks ago

blue-cat-whale commented 3 weeks ago

When I start a Python interpreter and import cudf.pandas, it fails with an ImportError that Python reports as most likely due to a circular import:

(cudf) [root@localhost test_cuda]# python3
Python 3.11.5 (main, Sep 22 2023, 15:34:29) [GCC 8.5.0 20210514 (Red Hat 8.5.0-20)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cudf.pandas
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/share/.virtualenvs/cudf/lib64/python3.11/site-packages/cudf/__init__.py", line 9, in <module>
    _setup_numba()
  File "/usr/local/share/.virtualenvs/cudf/lib64/python3.11/site-packages/cudf/utils/_numba.py", line 124, in _setup_numba
    _get_cc_60_ptx_file()
  File "/usr/local/share/.virtualenvs/cudf/lib64/python3.11/site-packages/cudf/utils/_numba.py", line 16, in _get_cc_60_ptx_file
    from cudf._lib import strings_udf
  File "/usr/local/share/.virtualenvs/cudf/lib64/python3.11/site-packages/cudf/_lib/__init__.py", line 4, in <module>
    from . import (
  File "/usr/local/share/.virtualenvs/cudf/lib64/python3.11/site-packages/cudf/_lib/pylibcudf/__init__.py", line 3, in <module>
    from . import (
  File "/usr/local/share/.virtualenvs/cudf/lib64/python3.11/site-packages/cudf/_lib/pylibcudf/strings/__init__.py", line 3, in <module>
    from . import case, find
ImportError: cannot import name 'case' from partially initialized module 'cudf._lib.pylibcudf.strings' (most likely due to a circular import) (/usr/local/share/.virtualenvs/cudf/lib64/python3.11/site-packages/cudf/_lib/pylibcudf/strings/__init__.py)

And this is the status of the package:

(cudf) [root@localhost test_cuda]# pip show cudf-cu12
Name: cudf-cu12
Version: 24.6.0
Summary: cuDF - GPU Dataframe
Home-page:
Author: NVIDIA Corporation
Author-email:
License: Apache 2.0
Location: /usr/local/share/.virtualenvs/cudf/lib64/python3.11/site-packages
Requires: cachetools, cuda-python, cupy-cuda12x, fsspec, numba, numpy, nvtx, packaging, pandas, pyarrow, pynvjitlink-cu12, rich, rmm-cu12, typing_extensions
Required-by:

bdice commented 2 weeks ago

@blue-cat-whale I was not able to reproduce this. I used the following commands:

conda create -n cudf-24.06 python=3.11
conda activate cudf-24.06
pip install --extra-index-url=https://pypi.nvidia.com cudf-cu12==24.6.*

Can you share more about how you installed cudf and whether you can reproduce in a fresh environment? Are you possibly running in a directory with files that cause importing from the local directory instead of from your site-packages directory?
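One way to answer the last question without triggering the failing import is to ask Python where it would load the package from. The following is a minimal sketch using only the standard library; the helper name locate_package is made up for illustration and is not part of cuDF.

```python
import importlib.util
import os


def locate_package(name):
    """Return the path Python would load the top-level package `name` from.

    For a top-level name, find_spec() only locates the package and does not
    execute it, so this works even when `import name` itself would fail.
    Returns None if the package cannot be found at all.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    if spec.origin:
        # Regular package: origin is the path of its __init__.py.
        return spec.origin
    # Namespace package (a directory without __init__.py): report its first location.
    locations = list(spec.submodule_search_locations or [])
    return locations[0] if locations else None


if __name__ == "__main__":
    print("cudf would be imported from:", locate_package("cudf"))
    print("current directory:          ", os.getcwd())
```

If the reported path sits under the current directory rather than under site-packages, a local file or folder named cudf is shadowing the installed package.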

PipGrylls commented 2 weeks ago

I am also getting this error:

ImportError: cannot import name 'case' from partially initialized module 'cudf._lib.pylibcudf.strings' (most likely due to a circular import) (/.../.testenv/lib/python3.11/site-packages/cudf/_lib/pylibcudf/strings/__init__.py)

Package            Version
------------------ -----------
cachetools         5.3.3
cuda-python        12.5.0
cudf-cu12          24.6.0
cupy-cuda12x       13.1.0
fastrlock          0.8.2
fsspec             2024.6.0
llvmlite           0.42.0
markdown-it-py     3.0.0
mdurl              0.1.2
numba              0.59.1
numpy              1.26.4
nvtx               0.2.10
packaging          24.1
pandas             2.2.2
pip                23.2.1
pyarrow            16.1.0
Pygments           2.18.0
pynvjitlink-cu12   0.2.4
python-dateutil    2.9.0.post0
pytz               2024.1
rich               13.7.1
rmm-cu12           24.6.0
setuptools         65.5.0
six                1.16.0
typing_extensions  4.12.2
tzdata             2024.1

I'm on a local HPC system, which provides CUDA/12.0.0 and Python/3.11.5.

Matt711 commented 2 weeks ago

Hey @PipGrylls, could you also share more about how you installed cuDF? Did you use a conda command from the installation page?

PipGrylls commented 2 weeks ago

Hi, I used pip install cudf-cu12

Matt711 commented 2 weeks ago

Okay, I was able to reproduce this in a fresh Python 3.11 environment by running pip install cudf-cu12. @PipGrylls Could you try the following in a fresh environment?

pip install \
    --extra-index-url=https://pypi.nvidia.com \
    cudf-cu12==24.6.*

blue-cat-whale commented 2 weeks ago

> Okay, I was able to reproduce this in a fresh Python 3.11 environment by running pip install cudf-cu12. @PipGrylls Could you try the following in a fresh environment?
>
> pip install \
>     --extra-index-url=https://pypi.nvidia.com \
>     cudf-cu12==24.6.*

It seems that the package installed by a plain pip install cudf-cu12 -U is problematic. After adding the --extra-index-url option, the circular import problem is gone.

PipGrylls commented 2 weeks ago

Hi that seems to have worked, with a couple of caveats.

It required installing the package while holding a GPU node allocation, probably because it needs to find and query the device during installation. On an HPC system, though, the environment may need to be loaded elsewhere.

I assume for the same reason, I needed to manually remove rmm-cu12 and reinstall it with --extra-index-url pointing to NVIDIA's index, also while on the node with the GPU.

Let me know if you need more info and thanks for your help!

wence- commented 2 weeks ago

Thanks all. pip install cudf-cu12 (and pip install rmm-cu12) should work, and there is a magic "nvidia-stub" library that is installed from pypi.org that should handle forwarding to pypi.nvidia.com for the relevant packages. If you just run pip install cudf-cu12 (say), then the first thing that happens is a placeholder package is downloaded that has a bunch of dependencies that are supposed to pull things to the right place.

However, there is a bug in the nvidia-stub library that means that, when running using python 3.11, it incorrectly ends up downloading wheels that were compiled for python 3.10 (even though 3.11 wheels exist).

When you then come to try and import things, this fails because although the package installed, the compiled parts of it are not compatible with your python version.
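A quick way to check an existing environment for this kind of mismatch is to look at the version tag embedded in the filenames of the compiled modules. The sketch below is illustrative only: the helper name foreign_extensions is made up, and it assumes a Linux CPython install where extension modules are named like module.cpython-311-x86_64-linux-gnu.so. Whether a given package's files carry such a tag depends on how it was built.

```python
import pathlib
import re
import sys

# Matches the CPython version tag in an extension filename, e.g. ".cpython-310".
_CPYTHON_TAG = re.compile(r"\.cpython-(\d+)")


def foreign_extensions(package_dir, running=None):
    """List .so files under package_dir built for a different CPython version.

    `running` is the version tag to compare against, e.g. "311"; by default
    it is derived from the interpreter executing this function.
    Files without a cpython-XY tag (such as abi3 modules) are ignored.
    """
    if running is None:
        running = f"{sys.version_info.major}{sys.version_info.minor}"
    foreign = []
    for path in sorted(pathlib.Path(package_dir).rglob("*.so")):
        match = _CPYTHON_TAG.search(path.name)
        if match and match.group(1) != running:
            foreign.append(path)
    return foreign
```

Calling foreign_extensions on the installed cudf directory (the Location reported by pip show, plus /cudf) from the affected interpreter would list any modules tagged for another Python version. Python's import system only considers extension files whose suffix matches the running interpreter, so a module tagged for 3.10 is invisible under 3.11, and the resulting ImportError carries the circular-import hint simply because the parent package is still initializing.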

We'll look at fixing this in the nvidia-stub library.

wence- commented 2 weeks ago

> It required installing the package whilst having the GPU node allocation. Probably because it needs to find and query the device during installation. But for HPC the environment may want to be loaded elsewhere.

@PipGrylls can you describe exactly what you mean by this? I know installation on HPC systems is always much more painful than on a local workstation, but we would like to make it as easy as possible.