turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Windows prebuilt install: DLL load failed while importing exllamav2_ext #118

Closed · mxbi closed this issue 9 months ago

mxbi commented 12 months ago

Hi there,

The library works great on Linux. On Windows, I have CUDA runtime 11.8 and Python 3.10 in my conda env, so I installed the matching pre-built wheel:

pip install https://github.com/turboderp/exllamav2/releases/download/v0.0.6/exllamav2-0.0.6+cu118-cp310-cp310-win_amd64.whl

However, when I import the library, I get:

Python 3.10.13 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:24:38) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import exllamav2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\mikel\anaconda3\envs\sniper\lib\site-packages\exllamav2\__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "C:\Users\mikel\anaconda3\envs\sniper\lib\site-packages\exllamav2\model.py", line 11, in <module>
    from exllamav2.cache import ExLlamaV2Cache
  File "C:\Users\mikel\anaconda3\envs\sniper\lib\site-packages\exllamav2\cache.py", line 2, in <module>
    from exllamav2.ext import exllamav2_ext as ext_c
  File "C:\Users\mikel\anaconda3\envs\sniper\lib\site-packages\exllamav2\ext.py", line 14, in <module>
    import exllamav2_ext
ImportError: DLL load failed while importing exllamav2_ext: The specified procedure could not be found.

Any ideas what this could be? Thanks a lot!
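
For what it's worth, the extension is a compiled module, so it presumably has to match the torch build it's imported into. A quick way to see what torch in the env was built with (standard torch attributes only, nothing exllamav2-specific):

import torch

print(torch.__version__)          # e.g. "2.1.0+cu118"; a bare version usually means the CPU-only build
print(torch.version.cuda)         # CUDA version torch was compiled against; None on CPU-only builds
print(torch.cuda.is_available())  # False with a CPU-only build or no visible GPU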

Zuellni commented 12 months ago

I've had the same issue on Windows with cp310/cp311-cu118 wheels specifically, both 0.0.5 and 0.0.6, but cu117 and cu121 work fine.

mxbi commented 12 months ago

Good to know @Zuellni. I can confirm that the issue persists even after installing a system-wide CUDA 11.8 to make sure the linked dependencies are present.

In the end, I compiled from source in the same environment and that worked, but it would be great to fix the pre-built binaries too if possible!

turboderp commented 12 months ago

What version of PyTorch are you seeing this with?

Zuellni commented 12 months ago

I'm seeing it only with pytorch=2.0.1 personally (both Python 3.10 and 3.11); importing ext from exllamav2 results in the same error as above. 2.1.0 works as expected.

turboderp commented 12 months ago

It must be some breaking change in PyTorch 2.1, then, affecting cu118 specifically? Sounds a little odd.

saood06 commented 12 months ago

I can reproduce the issue, but with PyTorch 2.1 nightly; swapping from nightly to stable fixed it for me with cu121 and cp310.

Zuellni commented 12 months ago

Nightly has been on 2.2 for a while now. I had an issue with it as well, but I assume that's because the wheels are built against 2.1/2.0.

ParisNeo commented 11 months ago

Some lollms users reported the same problem here.

turboderp commented 11 months ago

Right. I think for the next release I'll bump the requirement to PyTorch 2.1 stable, but that's really about the only thing I have to go on here, since I can't reproduce the issue myself.
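
A guard along those lines would at least turn the opaque DLL error into something actionable. As a sketch of the idea only (not code that's in the repo), the import could check the torch version up front:

import torch
from packaging import version

# Sketch: the prebuilt wheels would be compiled against torch 2.1, so refuse
# to continue on older builds instead of failing later inside the DLL import.
if version.parse(torch.__version__.split("+")[0]) < version.parse("2.1.0"):
    raise ImportError(f"prebuilt exllamav2 wheels need torch >= 2.1.0, found {torch.__version__}")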

ParisNeo commented 11 months ago

It's strange. On my home PC it works fine. On the PCs of two friends it doesn't, and they get the DLL error. I don't really know what's wrong. Python is not really that portable after all; we always run into problems when trying to build something that works on every system.

KamilSucharski commented 9 months ago

Same issue. I can't get it to work no matter which installation route I take.

Windows 11 Enterprise, Python 3.11, CUDA 12.1

killfrenzy96 commented 9 months ago

I'm also having this issue.

Windows 10, Python 3.11.5 (miniconda3), CUDA 12.1

I am using exllamav2-0.0.11+cu121-cp311-cp311-win_amd64.whl

$ pip list
Package                 Version
----------------------- ------------
aiohttp                 3.8.6
aiosignal               1.3.1
archspec                0.2.1
async-timeout           4.0.3
attrs                   23.1.0
boltons                 23.0.0
Brotli                  1.0.9
certifi                 2023.11.17
cffi                    1.16.0
charset-normalizer      2.0.4
colorama                0.4.6
conda                   23.11.0
conda-content-trust     0.2.0
conda-libmamba-solver   23.12.0
conda-package-handling  2.2.0
conda_package_streaming 0.9.0
cramjam                 2.7.0
cryptography            41.0.7
distro                  1.8.0
exllamav2               0.0.11+cu121
fastparquet             2023.10.1
filelock                3.13.1
frozenlist              1.4.1
fsspec                  2023.12.2
idna                    3.4
Jinja2                  3.1.2
jsonpatch               1.32
jsonpointer             2.1
libmambapy              1.5.3
MarkupSafe              2.1.3
menuinst                2.0.1
mpmath                  1.3.0
multidict               6.0.4
networkx                3.2.1
ninja                   1.11.1.1
numpy                   1.26.2
packaging               23.1
pandas                  2.1.4
pip                     23.3.1
platformdirs            3.10.0
pluggy                  1.0.0
py-cord                 2.4.1
pycosat                 0.6.6
pycparser               2.21
Pygments                2.17.2
pyOpenSSL               23.2.0
PySocks                 1.7.1
python-dateutil         2.8.2
pytz                    2023.3.post1
regex                   2023.12.25
requests                2.31.0
ruamel.yaml             0.17.21
safetensors             0.4.1
sentencepiece           0.1.99
setuptools              68.2.2
six                     1.16.0
sseclient-py            1.8.0
sympy                   1.12
timeago                 1.0.16
torch                   2.1.2
tqdm                    4.65.0
truststore              0.8.0
typing_extensions       4.9.0
tzdata                  2023.3
urllib3                 1.26.18
websockets              12.0
wheel                   0.41.2
win-inet-pton           1.1.0
yarl                    1.9.4
zstandard               0.19.0
(base)
Evan@KillFrenzy-Main MINGW64 /c/Programs/exllamav2 (master)
$ python --version
Python 3.11.5
(base)
Evan@KillFrenzy-Main MINGW64 /c/Programs/exllamav2 (master)
$ ./start_conversion.bat

C:\Programs\exllamav2>set CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1'

C:\Programs\exllamav2>python convert.py -i data/Mistral-7B-v0.1 -o data/quant -c data/wikitext-test.parquet -b 6.0
No CUDA runtime is found, using CUDA_HOME=''C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1''
Traceback (most recent call last):
  File "C:\Programs\exllamav2\convert.py", line 1, in <module>
    from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Tokenizer
  File "C:\Users\Evan\miniconda3\Lib\site-packages\exllamav2\__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "C:\Users\Evan\miniconda3\Lib\site-packages\exllamav2\model.py", line 17, in <module>
    from exllamav2.cache import ExLlamaV2CacheBase
  File "C:\Users\Evan\miniconda3\Lib\site-packages\exllamav2\cache.py", line 2, in <module>
    from exllamav2.ext import exllamav2_ext as ext_c
  File "C:\Users\Evan\miniconda3\Lib\site-packages\exllamav2\ext.py", line 15, in <module>
    import exllamav2_ext
ImportError: DLL load failed while importing exllamav2_ext: The specified module could not be found.
(base)
Evan@KillFrenzy-Main MINGW64 /c/Programs/exllamav2 (master)
$

Note that everything works correctly on my Linux install. It's only broken on Windows.
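
One more thing I noticed in the log: cmd's set keeps quotes as part of the value, so set CUDA_HOME='C:\...' stores the single quotes literally, which is why the banner prints CUDA_HOME=''C:\Program Files\...''. Easy to confirm from Python:

import os

# If the batch file quoted the path, the quotes are part of the value here.
print(repr(os.environ.get("CUDA_HOME")))

Setting the variable without quotes avoids the doubled quotes, though that alone may not be what breaks the import.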

turboderp commented 9 months ago

Are you sure this is the CUDA-enabled version of Torch?

torch                   2.1.2
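
A bare 2.1.2 with no +cu121 suffix is what pip reports for the default PyPI wheel, which on Windows is usually the CPU-only build; the prebuilt extension can't load against that.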

killfrenzy96 commented 9 months ago

It works now. Thank you. That may have been the source of some of the problems people had here. This was what I did:

pip uninstall torch
pip cache purge
pip install torch --index-url https://download.pytorch.org/whl/cu121

KamilSucharski commented 9 months ago

@killfrenzy96 Yes, that fixed it for me as well 😄 Perhaps the installation script should enforce that somehow?
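
Something like a fail-fast check at import time would do it. A sketch of the idea (not actual exllamav2 code):

import torch

# Refuse a CPU-only torch build up front, with a readable message, instead
# of failing deep inside the extension import with a DLL load error.
if torch.version.cuda is None:
    raise ImportError("exllamav2 needs a CUDA build of torch; reinstall it from https://download.pytorch.org/whl/cu121")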

ricklove commented 9 months ago

I had to upgrade torchvision also:

pip uninstall torch torchvision torchaudio
pip cache purge
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
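
(torchvision and torchaudio are each built against one specific torch version, so once torch is swapped out they have to be reinstalled together, or their own compiled extensions fail to load in the same way.)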

NguyenLamIT commented 9 months ago

@killfrenzy96 Thanks for your solution. It worked for me.

Eshita66 commented 8 months ago

19:01:54-799261 ERROR Failed to load the model.

Traceback (most recent call last):
  File "C:\Users\Mst Eshita Khatun\text-generation-webui\modules\ui_model_menu.py", line 213, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
  File "C:\Users\Mst Eshita Khatun\text-generation-webui\modules\models.py", line 87, in load_model
    output = load_func_map[loader](model_name)
  File "C:\Users\Mst Eshita Khatun\text-generation-webui\modules\models.py", line 387, in ExLlamav2_HF_loader
    from modules.exllamav2_hf import Exllamav2HF
  File "C:\Users\Mst Eshita Khatun\text-generation-webui\modules\exllamav2_hf.py", line 7, in <module>
    from exllamav2 import (
  File "C:\Conda\envs\eshi\Lib\site-packages\exllamav2\__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "C:\Conda\envs\eshi\Lib\site-packages\exllamav2\model.py", line 16, in <module>
    from exllamav2.config import ExLlamaV2Config
  File "C:\Conda\envs\eshi\Lib\site-packages\exllamav2\config.py", line 2, in <module>
    from exllamav2.fasttensors import STFile
  File "C:\Conda\envs\eshi\Lib\site-packages\exllamav2\fasttensors.py", line 5, in <module>
    from exllamav2.ext import exllamav2_ext as ext_c
  File "C:\Conda\envs\eshi\Lib\site-packages\exllamav2\ext.py", line 15, in <module>
    import exllamav2_ext
ImportError: DLL load failed while importing exllamav2_ext: The specified procedure could not be found.

How do I solve this? I tried the solution above but it doesn't work for me.

ParisNeo commented 8 months ago

This is back, and I don't understand why it keeps coming back! New lollms users are reporting the same error again even though it had been solved.

If it helps: people who installed an older version and then upgraded don't have this problem!