rapidsai-community / notebooks-contrib

RAPIDS Community Notebooks
Apache License 2.0

ModuleNotFoundError: No module named 'pyarrow._cuda' when running rapids-colab.ipynb [BUG] #348

Open gianlumessi opened 2 years ago

gianlumessi commented 2 years ago

Describe the bug: When running rapids-colab.ipynb as is (found here: https://colab.research.google.com/drive/1rY7Ln6rEE1pOlfSHCYOVaqt8OvDO35J0#forceEdit=true&offline=true&sandboxMode=true), I get: ModuleNotFoundError: No module named 'pyarrow._cuda'. The error is raised by import cudf and import cuml (the last cells in the notebook).

Steps/Code to reproduce bug I am running the "rapids-colab.ipynb" notebook as is in Colab

Expected behavior Successfully importing cuml and cudf packages.

Environment details (please complete the following information):

Additional context Error log

ModuleNotFoundError                       Traceback (most recent call last)
in ()
----> 1 import cudf
      2 import io, requests
      3
      4 # download CSV file from GitHub
      5 url="https://github.com/plotly/datasets/raw/master/tips.csv"

2 frames
/usr/local/lib/python3.7/site-packages/cudf/_lib/__init__.py in ()
      2 import numpy as np
      3
----> 4 from . import (
      5     avro,
      6     binaryop,

cudf/_lib/gpuarrow.pyx in init cudf._lib.gpuarrow()

ModuleNotFoundError: No module named 'pyarrow._cuda'
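
One way to confirm which module is actually missing, without triggering the full cudf import, is to ask the import system directly. The following is a minimal sketch; `module_available` is a hypothetical helper name introduced here for illustration, not part of cudf or pyarrow.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the import system can locate `name`.

    Hypothetical helper for illustration. For a dotted name, find_spec
    imports the parent package as a side effect.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or is not a package
        return False


for name in ("pyarrow", "pyarrow._cuda"):
    print(name, "->", module_available(name))
```

If `pyarrow` is reported as available but `pyarrow._cuda` is not, the pyarrow being picked up was probably built without CUDA support. Because the check imports the parent package, it is best run only after the failure has already occurred.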

vincentmele commented 2 years ago

I am also getting exactly the same error in Colab using the same setup.

taureandyernv commented 2 years ago

Colab made some breaking changes a few weeks ago that we are still figuring out. Ours weren't the only packages affected. Since Colab currently doesn't support the latest RAPIDS stable release, we recommend transitioning to SageMaker Studio Lab. A notice will go out soon, along with a blog post, but if you'd like, please try it out here:

https://github.com/rapidsai-community/rapids-smsl

You will need a SageMaker Studio Lab account to begin. All of that information is in the README. Hope this helps.

vincentmele commented 2 years ago

Indeed, this issue continues with the "new" issue in #10187. The link that was provided has a fix for this bug that resolves the issue for me. I run it immediately after the cell that starts with !python rapidsai-csp-utils/colab/install_rapids.py stable and before the text cell that says "Now you can run code!"

Posted here.

import sys

# clear Pandas and PyArrow from the module cache
# and force them to be reloaded on import.
# WARNING: I don't know what else this might break.
# Ideally, none of this should be in the module cache
# in the first place.

mods = [mod for mod in sys.modules if mod.startswith(("pandas", "pyarrow"))]
for mod in mods:
  del sys.modules[mod]

Note that you'll need to make sure you start with a fresh runtime. Thus:

  1. click Runtime -> Restart Runtime
  2. Then execute the code I posted above
  3. Only after that, import cudf

Originally posted by @shwina in https://github.com/rapidsai/cudf/issues/10187#issuecomment-1031681291
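
For anyone who wants to see exactly what gets evicted, the workaround above can be wrapped in a small function that returns the names it removed. This is only a sketch: `purge_modules` is a hypothetical name, and unlike the original snippet it matches on the exact top-level package name rather than a string prefix, so an unrelated package whose name merely starts with "pandas" is left alone. The same warning applies: it is unclear what else this might break.

```python
import sys


def purge_modules(prefixes):
    """Remove cached modules whose top-level package is in `prefixes`.

    Hypothetical helper for illustration. Returns the sorted list of
    module names that were deleted from sys.modules.
    """
    # Build the full list first so sys.modules is not mutated mid-iteration
    removed = sorted(
        name for name in sys.modules
        if name.split(".")[0] in prefixes
    )
    for name in removed:
        del sys.modules[name]
    return removed


removed = purge_modules({"pandas", "pyarrow"})
print(f"Removed {len(removed)} cached modules")
```

As with the original workaround, this is meant to run on a fresh runtime, before importing cudf.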

taureandyernv commented 2 years ago

Updating the thread: I've been working on and off on a new Colab script. RAPIDS installs, but importing cudf or anything else from RAPIDS yields an error with cuda.cudart:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-12-46fcec189046> in <module>
      1 # won't work until you copy the library files over.  here to show as an example
----> 2 import cudf

1 frames
/usr/local/lib/python3.8/site-packages/cudf/utils/gpu_utils.py in validate_setup()
     16     import warnings
     17 
---> 18     from cuda.cudart import cudaDeviceAttr, cudaError_t
     19 
     20     from rmm._cuda.gpu import (

ModuleNotFoundError: No module named 'cuda.cudart'

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------

cuda.cudart is not actually there, even after copying the libraries, and even though the install has the same library version as the Docker containers, which do have cudart.

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-14-2017f985906f> in <module>
----> 1 from cuda import cudart
      2 # also tried to find it in the library code itself.  It's not there.

ImportError: cannot import name 'cudart' from 'cuda' (/usr/local/lib/python3.8/site-packages/cuda/__init__.py)
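
A quick way to see what an installed package actually contains is to list its direct submodules. The sketch below uses only the standard library; `list_submodules` is a hypothetical helper name, and it is demonstrated on `json` so that it runs anywhere. In the failing runtime one would pass "cuda" instead and check whether "cudart" appears in the result.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the sorted names of a package's direct submodules.

    Hypothetical helper for illustration. Importing the package is a
    side effect of the lookup.
    """
    package = importlib.import_module(package_name)
    # Plain modules have no __path__, so they have no submodules
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return []
    return sorted(info.name for info in pkgutil.iter_modules(search_path))


# Demonstrated on a standard-library package
print(list_submodules("json"))
```

Comparing this listing between the Colab runtime and a working Docker container would show whether the two installs really ship the same contents.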
