turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Issue when ROCm and CUDA are both installed on Windows #345

Open sorasoras opened 4 months ago

sorasoras commented 4 months ago
```
python examples/chat.py C:\model\sparsetral-16x7B-v2-SPIN_iter1-exl2-6.5\ -p "Once upon a time,"
No ROCm runtime is found, using ROCM_HOME='C:\Program Files\AMD\ROCm\5.7'
Traceback (most recent call last):
  File "C:\model\exllamav2\examples\chat.py", line 5, in <module>
    from exllamav2 import(
  File "C:\Users\Sora\AppData\Local\Programs\Python\Python311\Lib\site-packages\exllamav2\__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "C:\Users\Sora\AppData\Local\Programs\Python\Python311\Lib\site-packages\exllamav2\model.py", line 25, in <module>
    from exllamav2.config import ExLlamaV2Config
  File "C:\Users\Sora\AppData\Local\Programs\Python\Python311\Lib\site-packages\exllamav2\config.py", line 2, in <module>
    from exllamav2.fasttensors import STFile
  File "C:\Users\Sora\AppData\Local\Programs\Python\Python311\Lib\site-packages\exllamav2\fasttensors.py", line 5, in <module>
    from exllamav2.ext import exllamav2_ext as ext_c
  File "C:\Users\Sora\AppData\Local\Programs\Python\Python311\Lib\site-packages\exllamav2\ext.py", line 2, in <module>
    from torch.utils.cpp_extension import load
  File "C:\Users\Sora\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\cpp_extension.py", line 205, in <module>
    HIP_HOME = _join_rocm_home('hip') if ROCM_HOME else None
               ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sora\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\cpp_extension.py", line 158, in _join_rocm_home
    raise OSError('Building PyTorch extensions using '
OSError: Building PyTorch extensions using ROCm and Windows is not supported.
```

What should I do to force exllamav2 to use CUDA instead of detecting ROCm?
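For reference, a minimal diagnostic (a sketch using only public `torch` attributes) that shows which backend the installed PyTorch wheel was actually built for; on Windows, `torch.version.hip` should be `None`, since no ROCm build of PyTorch exists there:

```python
# Minimal check of which backend this PyTorch wheel targets.
import torch

print(torch.__version__)          # e.g. '2.2.0+cu121' for a CUDA wheel
print(torch.version.cuda)         # CUDA toolkit version string, or None
print(torch.version.hip)          # HIP/ROCm version string, or None
print(torch.cuda.is_available())  # True if a usable CUDA device is visible
```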

turboderp commented 2 weeks ago

Sorry I skipped over this, apparently. My guess would be that you have the ROCm version of PyTorch installed. The solution would probably be a venv with the CUDA version of PyTorch.
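A sketch of that suggestion in PowerShell; the venv name and the cu121 wheel index are examples, not verified against this setup (pick the index URL matching your CUDA version from pytorch.org):

```powershell
python -m venv exl2-venv                 # hypothetical venv name
.\exl2-venv\Scripts\Activate.ps1
pip install torch --index-url https://download.pytorch.org/whl/cu121
pip install exllamav2
```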

sorasoras commented 1 week ago

> Sorry I skipped over this, apparently. My guess would be that you have the ROCm version of PyTorch installed. The solution would probably be a venv with the CUDA version of PyTorch.

I have ROCm for Windows installed, but not the ROCm build of PyTorch for Windows, since that doesn't exist yet.

I think the problem is that it keeps finding the Windows ROCm installation and skipping the CUDA build of PyTorch. How do I block exllamav2 from finding ROCm so it just uses CUDA?
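One possible workaround, as an untested sketch: judging by the traceback, `torch.utils.cpp_extension` resolves `ROCM_HOME` at import time (from the `ROCM_HOME`/`ROCM_PATH` environment variables, falling back to a `hipcc` found on `PATH` in recent torch versions), so clearing those variables before exllamav2 is imported may keep the CUDA path active:

```python
import os

# Assumption: nothing has imported torch.utils.cpp_extension yet, so its
# ROCM_HOME lookup has not run. Clear the variables it reads first.
os.environ.pop("ROCM_HOME", None)
os.environ.pop("ROCM_PATH", None)

from exllamav2 import ExLlamaV2  # must come after the environment cleanup
```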