NVIDIA-Merlin / Transformers4Rec

Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation and works with PyTorch.
https://nvidia-merlin.github.io/Transformers4Rec/main
Apache License 2.0

[BUG] conda env import error cudf #744

Closed · dcy0577 closed this issue 8 months ago

dcy0577 commented 10 months ago

Bug description

After installing the conda env according to the README, `AttributeError: module 'torch._C' has no attribute '_cuda_customAllocator'. Did you mean: '_cuda_CUDAAllocator'?` popped up when importing cudf.

Steps/Code to reproduce bug

  1. conda create -n t4rec -c nvidia -c rapidsai -c pytorch -c conda-forge \
    transformers4rec=23.04 `# NVIDIA Merlin` \
    nvtabular=23.04 `# NVIDIA Merlin - Used in example notebooks` \
    python=3.10 `# Compatible Python environment` \
    cudf=23.02 `# RAPIDS cuDF - GPU accelerated DataFrame` \
    cudatoolkit=11.8 pytorch-cuda=11.8 `# NVIDIA CUDA version`
  2. In the terminal, after activating the env:
    $ python
    Python 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import cudf
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home//miniconda3/envs/t4rec/lib/python3.10/site-packages/cudf/__init__.py", line 5, in <module>
        validate_setup()
      File "/home//miniconda3/envs/t4rec/lib/python3.10/site-packages/cudf/utils/gpu_utils.py", line 20, in validate_setup
        from rmm._cuda.gpu import (
      File "/home//miniconda3/envs/t4rec/lib/python3.10/site-packages/rmm/__init__.py", line 20, in <module>
        from rmm.rmm import (
      File "/home//miniconda3/envs/t4rec/lib/python3.10/site-packages/rmm/rmm.py", line 248, in <module>
        rmm_torch_allocator = CUDAPluggableAllocator(
      File "/home//miniconda3/envs/t4rec/lib/python3.10/site-packages/torch/cuda/memory.py", line 712, in __init__
        self._allocator = torch._C._cuda_customAllocator(alloc_fn, free_fn)
    AttributeError: module 'torch._C' has no attribute '_cuda_customAllocator'. Did you mean: '_cuda_CUDAAllocator'?
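
Note on the trace above: it shows rmm building a `CUDAPluggableAllocator` at import time (rmm.py line 248), which calls `torch._C._cuda_customAllocator`, an attribute the installed PyTorch build does not expose. A minimal diagnostic sketch to see which builds the solver actually installed (assumes the commands are run inside the activated t4rec env; the mismatch guess is not a confirmed root cause):

    # Inspect the resolved package versions; a PyTorch build without the
    # pluggable-allocator hook is the likely mismatch, but this is a guess.
    conda list | grep -E "pytorch|rmm|cudf|transformers4rec|nvtabular"
    python -c "import torch; print(torch.__version__, torch.version.cuda)"
    python -c "import torch; print(hasattr(torch._C, '_cuda_customAllocator'))"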

Expected behavior

The error should not appear; `import cudf` should succeed.

Environment details

Additional context

rnyak commented 10 months ago

@dcy0577 You need a compatible GPU and a properly installed CUDA driver to be able to import and use the cudf library. What is the output of `nvidia-smi` and `nvcc --version` on your machine? Can you please share it?
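
A quick sketch for collecting that information (assumes the NVIDIA driver and, if installed, the CUDA toolkit are on PATH):

    nvidia-smi        # driver version and visible GPUs
    nvcc --version    # CUDA toolkit version, if the toolkit is installed
    # Optional extra check from the same env; this should print True when the
    # driver and GPU are visible to PyTorch.
    python -c "import torch; print(torch.cuda.is_available())"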

dcy0577 commented 8 months ago

Solved the problem by upgrading to transformers4rec 23.06.
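
For reference, a rough sketch of the corresponding install command; only transformers4rec=23.06 is confirmed in this thread, so the other pins are assumptions and the 23.06 README should be treated as authoritative:

    # Hypothetical 23.06 environment; pins other than transformers4rec=23.06 are
    # guesses carried over from the original command above.
    conda create -n t4rec -c nvidia -c rapidsai -c pytorch -c conda-forge \
        transformers4rec=23.06 \
        nvtabular=23.06 \
        python=3.10 \
        pytorch-cuda=11.8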