Closed: mxbi closed this issue 9 months ago.
I've had the same issue on Windows with cp310/cp311-cu118 wheels specifically, both 0.0.5 and 0.0.6, but cu117 and cu121 work fine.
Good to know, @Zuellni. I can confirm that the issue persists even when I install CUDA 11.8 system-wide to make sure the dependencies are linked.
In the end, I compiled from source using the same environment and this worked, but it would be great to fix up the pre-built binaries too if possible!
What version of PyTorch are you seeing this with?
Personally, I'm seeing it only with pytorch=2.0.1 (on both Python 3.10 and 3.11); importing ext from exllamav2 results in the same error as above. 2.1.0 works as expected.
It must be some breaking change in PyTorch 2.1, then, affecting cu118 specifically? Sounds a little odd.
I can reproduce the issue, but with PyTorch 2.1 nightly; swapping from nightly to stable fixed it for me with cu121 and cp310.
Nightly has been on 2.2 for a while now. I had an issue with it as well, but I assume that's because the wheels are built against 2.1/2.0.
Some lollms users reported the same problem here.
Right. I think for the next release I'll bump the requirement to PyTorch 2.1 stable, but that's really about the only thing I have to go on here, since I can't reproduce the issue myself.
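For illustration, here is a minimal sketch of what such a runtime guard could look like. This is hypothetical, not code that exllamav2 actually ships, and the minimum version is an assumption based on this thread:

# Hypothetical guard: fail fast with a clear message instead of an opaque DLL error.
import torch
from packaging import version  # packaging ships alongside pip/setuptools

MIN_TORCH = "2.1.0"  # assumed minimum, per the discussion above

if version.parse(torch.__version__) < version.parse(MIN_TORCH):
    raise ImportError(
        f"The prebuilt exllamav2 extension requires torch >= {MIN_TORCH}, "
        f"but found {torch.__version__}. Upgrade torch or build from source."
    )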
It is awkward. On my home PC it works fine; on the PCs of two friends it doesn't, and they get the DLL problem. I don't really know what's wrong. Python is not all that portable after all; we always run into problems when trying to build something that runs on all systems.
Same issue. I can't get it to work no matter which installation route I take.
Windows 11 Enterprise
Python 3.11
CUDA 12.1
I'm also having this issue.
Windows 10
Python 3.11.5 (miniconda3)
CUDA 12.1
I am using exllamav2-0.0.11+cu121-cp311-cp311-win_amd64.whl
$ pip list
Package Version
----------------------- ------------
aiohttp 3.8.6
aiosignal 1.3.1
archspec 0.2.1
async-timeout 4.0.3
attrs 23.1.0
boltons 23.0.0
Brotli 1.0.9
certifi 2023.11.17
cffi 1.16.0
charset-normalizer 2.0.4
colorama 0.4.6
conda 23.11.0
conda-content-trust 0.2.0
conda-libmamba-solver 23.12.0
conda-package-handling 2.2.0
conda_package_streaming 0.9.0
cramjam 2.7.0
cryptography 41.0.7
distro 1.8.0
exllamav2 0.0.11+cu121
fastparquet 2023.10.1
filelock 3.13.1
frozenlist 1.4.1
fsspec 2023.12.2
idna 3.4
Jinja2 3.1.2
jsonpatch 1.32
jsonpointer 2.1
libmambapy 1.5.3
MarkupSafe 2.1.3
menuinst 2.0.1
mpmath 1.3.0
multidict 6.0.4
networkx 3.2.1
ninja 1.11.1.1
numpy 1.26.2
packaging 23.1
pandas 2.1.4
pip 23.3.1
platformdirs 3.10.0
pluggy 1.0.0
py-cord 2.4.1
pycosat 0.6.6
pycparser 2.21
Pygments 2.17.2
pyOpenSSL 23.2.0
PySocks 1.7.1
python-dateutil 2.8.2
pytz 2023.3.post1
regex 2023.12.25
requests 2.31.0
ruamel.yaml 0.17.21
safetensors 0.4.1
sentencepiece 0.1.99
setuptools 68.2.2
six 1.16.0
sseclient-py 1.8.0
sympy 1.12
timeago 1.0.16
torch 2.1.2
tqdm 4.65.0
truststore 0.8.0
typing_extensions 4.9.0
tzdata 2023.3
urllib3 1.26.18
websockets 12.0
wheel 0.41.2
win-inet-pton 1.1.0
yarl 1.9.4
zstandard 0.19.0
(base)
Evan@KillFrenzy-Main MINGW64 /c/Programs/exllamav2 (master)
$ python --version
Python 3.11.5
(base)
Evan@KillFrenzy-Main MINGW64 /c/Programs/exllamav2 (master)
$ ./start_conversion.bat
C:\Programs\exllamav2>set CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1'
C:\Programs\exllamav2>python convert.py -i data/Mistral-7B-v0.1 -o data/quant -c data/wikitext-test.parquet -b 6.0
No CUDA runtime is found, using CUDA_HOME=''C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1''
Traceback (most recent call last):
  File "C:\Programs\exllamav2\convert.py", line 1, in <module>
    from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Tokenizer
  File "C:\Users\Evan\miniconda3\Lib\site-packages\exllamav2\__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "C:\Users\Evan\miniconda3\Lib\site-packages\exllamav2\model.py", line 17, in <module>
    from exllamav2.cache import ExLlamaV2CacheBase
  File "C:\Users\Evan\miniconda3\Lib\site-packages\exllamav2\cache.py", line 2, in <module>
    from exllamav2.ext import exllamav2_ext as ext_c
  File "C:\Users\Evan\miniconda3\Lib\site-packages\exllamav2\ext.py", line 15, in <module>
    import exllamav2_ext
ImportError: DLL load failed while importing exllamav2_ext: The specified module could not be found.
(base)
Evan@KillFrenzy-Main MINGW64 /c/Programs/exllamav2 (master)
$
Note that everything works correctly on my Linux install. It's only broken on Windows.
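As a side note, one way to triage a Windows "DLL load failed" on the extension is to import torch first, so its bundled CUDA DLLs are already loaded into the process, and to register the CUDA bin directory explicitly, since Python 3.8+ on Windows no longer searches PATH for extension-module DLL dependencies. A rough diagnostic sketch, using the CUDA path from the log above:

import os

import torch  # loads torch's bundled CUDA runtime DLLs, if this is a CUDA build

print(torch.__version__, torch.version.cuda)  # e.g. "2.1.2+cu121" / "12.1"

# Python 3.8+ on Windows ignores PATH when resolving extension-module DLLs,
# so extra directories have to be registered explicitly.
cuda_bin = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin"
if os.path.isdir(cuda_bin):
    os.add_dll_directory(cuda_bin)

import exllamav2_ext  # if this still fails, suspect a torch/wheel ABI mismatch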
Are you sure this is the CUDA-enabled version of Torch?
torch 2.1.2
It works now, thank you! That may have been the source of some of the problems people had here. This is what I did:
pip uninstall torch
pip cache purge
pip install torch --index-url https://download.pytorch.org/whl/cu121
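For anyone checking whether their torch install is the culprit: the default PyPI wheel for torch on Windows is CPU-only, and a minimal check makes the difference visible:

import torch

print(torch.__version__)          # "2.1.2+cpu" on the CPU-only wheel, "2.1.2+cu121" after the fix
print(torch.version.cuda)         # None on CPU-only builds, "12.1" on cu121 builds
print(torch.cuda.is_available())  # True with a CUDA build and a working driver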
@killfrenzy96 Yes, that fixed it for me as well 😄 Perhaps the installation script should enforce that somehow?
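For illustration, a hypothetical pre-flight check that an installer or startup script could run; this is a sketch, not exllamav2's actual install logic:

# Hypothetical pre-flight check: refuse to continue with a CPU-only torch.
import sys

try:
    import torch
except ImportError:
    sys.exit("torch is not installed. Install a CUDA build first, e.g.\n"
             "  pip install torch --index-url https://download.pytorch.org/whl/cu121")

if torch.version.cuda is None:
    sys.exit(f"Found CPU-only torch {torch.__version__}. The prebuilt exllamav2 "
             "wheels need a CUDA build of torch; reinstall from the cu121 index.")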
I also had to upgrade torchvision:
pip uninstall torch torchvision torchaudio
pip cache purge
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
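After the reinstall, a quick sanity check (a sketch) confirms all three packages came from the same CUDA index, since the local-version suffix encodes the build:

import torch
import torchaudio
import torchvision

# All three should report the same suffix, e.g. "+cu121"; a "+cpu" suffix
# means that package was pulled from the default PyPI index instead.
for mod in (torch, torchvision, torchaudio):
    print(f"{mod.__name__:12s} {mod.__version__}")
print("torch CUDA build:", torch.version.cuda)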
@killfrenzy96 Thanks for your solution, it worked for me.
19:01:54-799261 ERROR Failed to load the model.
Traceback (most recent call last):
  File "C:\Users\Mst Eshita Khatun\text-generation-webui\modules\ui_model_menu.py", line 213, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mst Eshita Khatun\text-generation-webui\modules\models.py", line 87, in load_model
    output = load_func_map[loader](model_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mst Eshita Khatun\text-generation-webui\modules\models.py", line 387, in ExLlamav2_HF_loader
    from modules.exllamav2_hf import Exllamav2HF
  File "C:\Users\Mst Eshita Khatun\text-generation-webui\modules\exllamav2_hf.py", line 7, in <module>
This is back, and I don't understand why it keeps coming back! New lollms users are reporting this same error again, even though it had been solved.
If it helps, people who installed old versions and then upgraded don't have this problem!
Hi there,
The library works great on Linux. On Windows, I have the CUDA 11.8 runtime and Python 3.10 in my conda env, so I install the relevant pre-built binary:
However, when I import the library, I get:
Any ideas what this could be? Thanks a lot!