BlinkDL / ChatRWKV

ChatRWKV is like ChatGPT but powered by the RWKV (100% RNN) language model, and it is open source.
Apache License 2.0
9.42k stars · 696 forks

FileNotFoundError: 'torch_extensions\torch_extensions\Cache\py310_cu118\wkv_cuda\wkv_cuda.pyd' #99

Closed · linonetwo closed this issue 1 year ago

linonetwo commented 1 year ago
PS E:\repo\ChatRWKV> python E:\repo\ChatRWKV\v2\chat.py

ChatRWKV v2 https://github.com/BlinkDL/ChatRWKV

English - cuda fp16i8 *30 -> cuda fp16 - E:\repo\ChatRWKV\v2/prompt/default/English-2.py
Using C:\Users\Administrator\AppData\Local\torch_extensions\torch_extensions\Cache\py310_cu118 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file C:\Users\Administrator\AppData\Local\torch_extensions\torch_extensions\Cache\py310_cu118\wkv_cuda\build.ninja...
Building extension module wkv_cuda...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module wkv_cuda...
C:\Users\Administrator\AppData\Local\torch_extensions\torch_extensions\Cache\py310_cu118\wkv_cuda\wkv_cuda.pyd
Traceback (most recent call last):
  File "E:\repo\ChatRWKV\v2\chat.py", line 99, in <module>
    from rwkv.model import RWKV
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\rwkv\model.py", line 34, in <module>
    load(
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\cpp_extension.py", line 1536, in _jit_compile
    return _import_module_from_library(name, build_directory, is_python_module)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\cpp_extension.py", line 1935, in _import_module_from_library
    torch.ops.load_library(filepath)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_ops.py", line 644, in load_library
    ctypes.CDLL(path)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'C:\Users\Administrator\AppData\Local\torch_extensions\torch_extensions\Cache\py310_cu118\wkv_cuda\wkv_cuda.pyd' (or one of its dependencies). Try using the full path with constructor syntax.

The error

FileNotFoundError: Could not find module 'C:\Users\Administrator\AppData\Local\torch_extensions\torch_extensions\Cache\py310_cu118\wkv_cuda\wkv_cuda.pyd' (or one of its dependencies). Try using the full path with constructor syntax.

should not have happened, because wkv_cuda.pyd is there:

PS E:\repo\ChatRWKV> ls C:\Users\Administrator\AppData\Local\torch_extensions\torch_extensions\Cache\py310_cu118\wkv_cuda\wkv_cuda.pyd

    Directory: C:\Users\Administrator\AppData\Local\torch_extensions\torch_extensions\Cache\py310_cu118\wkv_cuda

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a---           2023/4/21    10:06         434688 wkv_cuda.pyd

The error is raised from this code in CPython's ctypes/__init__.py:

if _os.name == "nt":
    from _ctypes import LoadLibrary as _dlopen

# ...
        if handle is None:
            self._handle = _dlopen(self._name, mode)

This LoadLibrary is buggy on Windows when loading .pyd files.
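
Given that the .pyd clearly exists, the "(or one of its dependencies)" part of the message is the real hint: a DLL that wkv_cuda.pyd links against (typically a CUDA runtime DLL) cannot be found. A minimal sketch to narrow this down, with a hypothetical toolkit path that has to be adjusted to whatever CUDA version is actually installed:

import os
import ctypes

# Hypothetical location; point this at the bin directory of the installed CUDA toolkit.
cuda_bin = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin"

# Since Python 3.8, Windows no longer resolves dependent DLLs through PATH alone;
# extra directories must be registered explicitly.
os.add_dll_directory(cuda_bin)

pyd = r"C:\Users\Administrator\AppData\Local\torch_extensions\torch_extensions\Cache\py310_cu118\wkv_cuda\wkv_cuda.pyd"
ctypes.CDLL(pyd)  # still raises FileNotFoundError if another dependency is missing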

linonetwo commented 1 year ago

Solved by installing CUDA 11.7.1 instead of 12:

https://developer.nvidia.com/cuda-11-7-1-download-archive?target_os=Windows&target_arch=x86_64&target_version=10&target_type=exe_network

And reinstalling torch:

 pip install --force-reinstall -v "torch==1.13.1" "torchaudio==0.13.1" "torchvision==0.14.1"  --extra-index-url https://download.pytorch.org/whl/cu117
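
After the toolkit swap, a quick sanity check (a minimal sketch, nothing ChatRWKV-specific) that the reinstalled wheel really targets CUDA 11.7 and that the runtime is visible:

import torch

print(torch.__version__)          # expect something like '1.13.1+cu117'
print(torch.version.cuda)         # expect '11.7'
print(torch.cuda.is_available())  # expect True before retrying chat.py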
linonetwo commented 1 year ago

After upgrading to Win11:

PS E:\repo\ChatRWKV> pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
Looking in indexes: https://mirrors.cloud.tencent.com/pypi/simple, https://download.pytorch.org/whl/cu117
ERROR: Could not find a version that satisfies the requirement torch==1.13.1+cu117 (from versions: 2.0.0, 2.0.0+cu117)
ERROR: No matching distribution found for torch==1.13.1+cu117

It always says "No matching distribution found for torch==1.13.1+cu117".

And running chat.py says "No CUDA runtime is found":

PS E:\repo\ChatRWKV> python v2\chat.py

ChatRWKV v2 https://github.com/BlinkDL/ChatRWKV

English - cuda fp16i8 *20 -> cuda fp16 - E:\repo\ChatRWKV\v2/prompt/default/English-2.py
No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7'
Using C:\Users\linonetwo\AppData\Local\torch_extensions\torch_extensions\Cache\py311_cpu as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file C:\Users\linonetwo\AppData\Local\torch_extensions\torch_extensions\Cache\py311_cpu\wkv_cuda\build.ninja...
Traceback (most recent call last):
  File "E:\repo\ChatRWKV\v2\chat.py", line 99, in <module>
    from rwkv.model import RWKV
  File "E:\repo\ChatRWKV\v2/../rwkv_pip_package/src\rwkv\model.py", line 29, in <module>
    load(
  File "C:\Users\linonetwo\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\cpp_extension.py", line 1284, in load
    return _jit_compile(
           ^^^^^^^^^^^^^
  File "C:\Users\linonetwo\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\cpp_extension.py", line 1509, in _jit_compile
    _write_ninja_file_and_build_library(
  File "C:\Users\linonetwo\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\cpp_extension.py", line 1611, in _write_ninja_file_and_build_library
    _write_ninja_file_to_build_library(
  File "C:\Users\linonetwo\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\cpp_extension.py", line 2007, in _write_ninja_file_to_build_library
    cuda_flags = common_cflags + COMMON_NVCC_FLAGS + _get_cuda_arch_flags()
                                                     ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\linonetwo\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\cpp_extension.py", line 1773, in _get_cuda_arch_flags
    arch_list[-1] += '+PTX'
    ~~~~~~~~~^^^^
IndexError: list index out of range
linonetwo commented 1 year ago

Try installing Python 3.10 instead of the latest 3.11:

https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/7166#issuecomment-1402265851
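
A likely reason for the pip failure above (an assumption, not stated in the linked issue): torch 1.13.1 wheels were never published for Python 3.11, so under 3.11 pip can only resolve 2.0.0. A minimal guard before installing:

import sys

# torch==1.13.1+cu117 ships no cp311 wheels; run the install from a 3.10 (or older) interpreter.
assert sys.version_info[:2] <= (3, 10), "use Python 3.10 or older for torch 1.13.1"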

SeekPoint commented 1 year ago

Got the same issue with Python 3.11.

LindiaC commented 10 months ago

@SeekPoint It's because the arch list is empty. For example, I have a V100, and its compute capability is 7.0 (the compute capability of each GPU can be looked up in NVIDIA's documentation). Then run export TORCH_CUDA_ARCH_LIST="7.0+PTX" first.
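
Note that export is bash syntax; on Windows the same variable can be set from Python before the extension is built. A minimal sketch, assuming a GPU with compute capability 7.0 as in the V100 example above:

import os

# TORCH_CUDA_ARCH_LIST is read by torch.utils.cpp_extension when it compiles wkv_cuda,
# so it must be set before `from rwkv.model import RWKV` triggers the build.
os.environ["TORCH_CUDA_ARCH_LIST"] = "7.0+PTX"

from rwkv.model import RWKV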