turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Update from 0.0.19 to 0.0.20 with Python 3.11, torch 2.2.1 and CUDA 12.1: DLL load failed while importing exllamav2_ext: The specified procedure could not be found. #434

Closed: acidbubbles closed this issue 2 weeks ago

acidbubbles commented 2 months ago

I tried a clean install and still got this error:

 DLL load failed while importing exllamav2_ext: The specified procedure could not be found.
  File "python-3.11.8-amd64\Lib\site-packages\exllamav2\ext.py", line 19, in <module>
    import exllamav2_ext
  File "python-3.11.8-amd64\Lib\site-packages\exllamav2\fasttensors.py", line 6, in <module>
    from exllamav2.ext import exllamav2_ext as ext_c
  File "python-3.11.8-amd64\Lib\site-packages\exllamav2\config.py", line 3, in <module>
    from exllamav2.fasttensors import STFile
  File "python-3.11.8-amd64\Lib\site-packages\exllamav2\model.py", line 23, in <module>
    from exllamav2.config import ExLlamaV2Config
  File "python-3.11.8-amd64\Lib\site-packages\exllamav2\__init__.py", line 3, in <module>

I can see the release notes say "Wheels compiled for PyTorch 2.3.0" (https://github.com/turboderp/exllamav2/releases), but requirements.txt says torch>=2.2.0: https://github.com/turboderp/exllamav2/blob/master/requirements.txt.

Note that I am running on Windows, with PythonNet (although I don't think that's related, since it loads python311.dll anyway).

It worked with versions 0.0.11, 0.0.12, 0.0.13.post1 and .post2, 0.0.14, 0.0.15 and 0.0.19.

Let me know if there's some additional information I can provide. Upgrading PyTorch may be feasible, but since I also depend on multiple other packages, it might wreak havoc on the dependency tree :)

Note: The error seems to be the same as https://github.com/turboderp/exllamav2/issues/118 - which may point to a problem with torch 2.2.x support.
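One way to catch this mismatch before the cryptic import error is to compare the installed torch version against the version the prebuilt wheel targets. A minimal sketch (the helper names are illustrative, not part of exllamav2; the assumption is that prebuilt extensions need the same major.minor torch release they were compiled against):

```python
# Pre-flight check: does the installed torch match the torch the wheel was
# built for? Prebuilt C++ extensions generally break across major.minor
# torch releases because the extension ABI changes.

def version_tuple(version: str) -> tuple:
    """Turn a version string like '2.2.1+cu121' into (2, 2, 1),
    ignoring any local build suffix after '+'."""
    core = version.split("+")[0]
    return tuple(int(part) for part in core.split(".")[:3])

def wheel_matches(installed: str, built_for: str) -> bool:
    """True if the major.minor torch releases agree."""
    return version_tuple(installed)[:2] == version_tuple(built_for)[:2]

# The v0.0.20 wheels were built for torch 2.3.0; the reporter has 2.2.1.
print(wheel_matches("2.2.1+cu121", "2.3.0"))  # → False: expect import failure
```

In a real script you would pass `torch.__version__` as the first argument before importing `exllamav2`.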

turboderp commented 2 months ago

The reason I compiled against Torch 2.3 this time was that many people have upgraded and can no longer use wheels compiled against 2.2. I don't really know what to do about it unless someone can convince the PyTorch people to stop breaking the extension API/ABI with every new release.

You don't need 2.3 to run ExLlamaV2 if you're not installing it with a prebuilt wheel. In fact it likely works with Torch 2.0 and maybe 1.x as well. But I've mostly tested on 2.2 so I set that as a minimum req out of caution. Sadly I don't know of a good way to release wheels with individual requirements for each.

An option could be to add another dimension to the build matrix and have four times as many release versions to support the four most recent PyTorch versions, but I'm not sure that's a great road to go down either. :shrug:
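To see why that road is unappealing, the build matrix can be sketched out. The version lists below are illustrative, not the project's actual matrix: adding a torch dimension multiplies every existing combination.

```python
# Sketch of the wheel build matrix turboderp describes. Each extra
# dimension multiplies the number of wheels to build, test, and host
# per release.
from itertools import product

python_versions = ["3.9", "3.10", "3.11"]        # illustrative
cuda_versions = ["11.8", "12.1"]                 # illustrative
torch_versions = ["2.0", "2.1", "2.2", "2.3"]    # "four most recent"

matrix = list(product(python_versions, cuda_versions, torch_versions))
print(len(matrix))  # → 24 wheels per release, vs 6 without the torch axis
```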

But anyway, yes, you need PyTorch 2.3 to use the prebuilt wheels for v0.0.20.

acidbubbles commented 2 months ago

Got it, thanks for the quick answer. I'll see what breaks if I update to torch 2.3 and worst case, wait for everyone to catch up. Thanks again for the great lib, I appreciate the work you do!

acidbubbles commented 2 months ago

Well, I'm not out of the woods yet. With Torch 2.3.0:

[WinError 126] The specified module could not be found. Error loading "...\python-3.11.9-amd64\Lib\site-packages\torch\lib\shm.dll" or one of its dependencies.

The DLL file is there, but this might be on my end. I'll update this ticket when I've figured it out, for others' benefit :)

Update: Maybe this is an issue with torch 2.3.0 on Windows. https://github.com/pytorch/pytorch/issues/125109
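WinError 126 is often misleading: the error names shm.dll, but the file that is actually missing is usually one of its dependencies (for instance a CUDA runtime DLL). A hedged diagnostic sketch, assuming a Windows box with torch installed, is to try loading every DLL in torch's lib directory directly and see which ones fail:

```python
# Diagnostic sketch for WinError 126: probe each DLL in torch's lib
# directory individually, so the one whose dependency chain is broken
# can be identified. Windows-only at the point of actual DLL loading.
import ctypes
import os
import sys

def probe_dlls(lib_dir: str) -> list:
    """Return (dll_name, error_message) pairs for DLLs that fail to load."""
    failures = []
    for name in sorted(os.listdir(lib_dir)):
        if not name.lower().endswith(".dll"):
            continue
        try:
            ctypes.WinDLL(os.path.join(lib_dir, name))
        except OSError as exc:
            failures.append((name, str(exc)))
    return failures

if sys.platform == "win32":
    import torch
    lib_dir = os.path.join(os.path.dirname(torch.__file__), "lib")
    for name, err in probe_dlls(lib_dir):
        print(f"{name}: {err}")
```

This only narrows down which DLL fails; tools like Dependencies (the modern Dependency Walker) can then show which specific import is unresolved.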

turboderp commented 2 months ago

It's possible that installing the CUDA toolkit solves it. The wheels are built with Torch 2.3.0 on a Windows instance, after all.

acidbubbles commented 2 months ago

That's what some people have been saying on StackOverflow (see the pytorch issue; it was marked as high priority, so we should know for sure soon). If it's a new requirement for 2.3, so be it, but since it wasn't needed before, I'm guessing we'll just need an additional pip dependency, or some additional files will be packaged with pytorch itself. We'll see.

IdiotSandwichTheThird commented 1 month ago

> The reason I compiled against Torch 2.3 this time was that many people have upgraded and can no longer use wheels compiled against 2.2. I don't really know what to do about it unless someone can convince the PyTorch people to stop breaking the extension API/ABI with every new release.
>
> You don't need 2.3 to run ExLlamaV2 if you're not installing it with a prebuilt wheel. In fact it likely works with Torch 2.0 and maybe 1.x as well. But I've mostly tested on 2.2 so I set that as a minimum req out of caution. Sadly I don't know of a good way to release wheels with individual requirements for each.
>
> An option could be to add another dimension to the build matrix and have four times as many release versions to support the four most recent PyTorch versions, but I'm not sure that's a great road to go down either. 🤷
>
> But anyway, yes, you need PyTorch 2.3 to use the prebuilt wheels for v0.0.20.

AFAIK there aren't even any flash-attention wheels for Windows + PyTorch 2.3 yet; it seems really premature to update this soon.

acidbubbles commented 1 month ago

At least https://github.com/pytorch/pytorch/issues/125109 was closed, so we can expect the next torch release to fix this part. In the meantime, I guess we'll have to be patient 🤷

turboderp commented 1 month ago

Is there a workaround in the meantime?

acidbubbles commented 1 month ago

I wasn't able to find one. I thought fiddling with the PATH would work, but it didn't. I didn't spend a lot of time on this, though.
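Fiddling with PATH not working is expected: since Python 3.8, extension-module DLL resolution on Windows ignores PATH, and directories must be registered explicitly with `os.add_dll_directory`. A hedged sketch of that workaround (it won't help if the missing dependency is absent from the system entirely, which seems to be the case with the shm.dll issue):

```python
# Register torch's bundled DLL directory before importing exllamav2_ext.
# os.add_dll_directory exists only on Windows (Python 3.8+), so the call
# is guarded by a platform check.
import os
import sys

def add_torch_lib_dir() -> None:
    """Make torch's lib/ directory visible to Windows DLL resolution."""
    if sys.platform != "win32":
        return
    import torch
    lib_dir = os.path.join(os.path.dirname(torch.__file__), "lib")
    os.add_dll_directory(lib_dir)
```

The call would need to run before the first `import exllamav2` in the process.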

turboderp commented 1 month ago

I've reworked the build actions and added some Torch 2.2.0 Windows wheels to the release of v0.0.21. Hope that helps.

acidbubbles commented 1 month ago

It seems to be working, thanks a lot, it does help!