turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License
3.45k stars · 257 forks

No module named 'exllamav2_ext' when loading a model #159

Closed ParisNeo closed 4 months ago

ParisNeo commented 9 months ago

Hi, when I load a model after installing exllamav2, I get this error:

File "c:\Users\sa226037\ai\lollms-ui\lollms-webui\env\lib\site-packages\exllamav2\ext.py", line 14, in <module>
    import exllamav2_ext
ModuleNotFoundError: No module named 'exllamav2_ext'

tutu329 commented 9 months ago

> Hi, when I load a model after installing exllamav2, I get this error:
>
> File "c:\Users\sa226037\ai\lollms-ui\lollms-webui\env\lib\site-packages\exllamav2\ext.py", line 14, in <module>
>     import exllamav2_ext
> ModuleNotFoundError: No module named 'exllamav2_ext'

Maybe you should compile exllamav2 from source.
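Building from source usually looks like the standard pip source-install flow. This is a sketch under the assumption of a working CUDA toolkit and C++ compiler on the machine; check the repository's README for the authoritative steps.

```shell
# Sketch: build exllamav2 (and its exllamav2_ext extension) from source.
# Assumes CUDA toolkit + compiler are installed and match your PyTorch build.
git clone https://github.com/turboderp/exllamav2
cd exllamav2
pip install -r requirements.txt
pip install .    # compiles the CUDA extension locally instead of using a prebuilt wheel
```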

ParisNeo commented 9 months ago

Hi, If I rebuild the wheel, can I share it on my hugging face and automate the access to it using my installer?

ParisNeo commented 9 months ago

I compiled it and it runs fantastically fast. Can I share it with my users? Or should I let you compile a 12.1 version and add it to prebuilt wheels? I just want to ask permission before doing it.

tutu329 commented 9 months ago

> I compiled it and it runs fantastically fast. Can I share it with my users? Or should I let you compile a 12.1 version and add it to prebuilt wheels? I just want to ask permission before doing it.

you can ask turboderp ^ ^

turboderp commented 9 months ago

There's nothing stopping you from distributing a prebuilt wheel to your users. But it's the multitude of different setups that makes it tricky, so are you sure your wheel will work for your users as well?

Of course ideally I'd want to figure out what's wrong with the prebuilt 12.1 wheels in the releases...

ParisNeo commented 9 months ago

Thanks a lot. I'm going to add a little hack to my code that tries your wheel first and, if that fails, falls back to mine :)
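The fallback described above can be sketched as a small helper that tries candidate extension modules in order and returns the first one that imports. The module names passed in are purely illustrative; this is not how exllamav2 itself loads its extension.

```python
import importlib


def import_first_available(candidates):
    """Return the first importable module from a list of candidate names.

    For the scenario above, the candidates might be the module from the
    official prebuilt wheel followed by the module from a self-built wheel.
    """
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            errors.append(f"{name}: {exc}")
    raise ImportError("no candidate could be imported:\n" + "\n".join(errors))
```

Usage would be something like `ext = import_first_available(["exllamav2_ext", "my_fallback_ext"])`, where the second name is a hypothetical self-built module.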

turboderp commented 9 months ago

I've just done some more digging, and are you on PyTorch 2.0.1 by any chance? Because it does appear that something about how Torch handles extensions has changed between 2.0.1 and 2.1.x. Maybe just for the Windows version?

In any case I was getting errors as well for PyTorch 2.0.1 on this here Windows PC, and they went away when I upgraded to 2.1.1. The wheels are built using 2.1.0.
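Given that the wheels are built against Torch 2.1.x, a quick sanity check on the installed version can catch this mismatch early. A minimal sketch, assuming PyTorch-style version strings (which may carry a local-build suffix like "+cu121"):

```python
def meets_minimum(version: str, minimum=(2, 1, 0)) -> bool:
    """Check a PyTorch-style version string against a minimum version.

    Strips local-build suffixes such as "+cu121" before comparing
    numerically, so "2.1.1+cu121" compares as (2, 1, 1).
    """
    core = version.split("+")[0]
    parts = []
    for piece in core.split(".")[:3]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts) >= minimum


# In practice: import torch; meets_minimum(torch.__version__)
```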

ParisNeo commented 9 months ago

Hi. I've now moved everything to the newest PyTorch, and everything is working perfectly.

adityamiskin commented 8 months ago

It doesn't work for me, sadly. Below are my specifications. Have I made some mistake?

[screenshots of the reporter's environment, including the installed CUDA toolkit version]

ParisNeo commented 8 months ago

I moved to CUDA 12.1.

turboderp commented 8 months ago

@adityamiskin You have CUDA toolkit 11.5 installed. Your GPU+driver will support up to 12.2, so you should be able to use the prebuilt wheel for cu118 matching your Torch version. Otherwise you'll have to upgrade the CUDA toolkit to a later version.