Closed rjmehta1993 closed 1 week ago
I'm also seeing a similar error using the 0.0.19 release.
I am using the following whl: https://github.com/turboderp/exllamav2/releases/download/v0.0.19/exllamav2-0.0.19+cu118-cp310-cp310-linux_x86_64.whl
This was working fine last week, and today when I rebuilt my container I started seeing this issue.
ImportError: /usr/local/lib/python3.10/dist-packages/exllamav2_ext.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi
The reason for this particular kind of error is a version mismatch between your PyTorch binaries and the ExLlamaV2 binaries. For some reason PyTorch breaks the extension API with every new release, and I haven't found a good way to keep track of it all.
But basically, if you're building from source or using the JIT mode, everything should work regardless of version. For the prebuilt wheels you want torch==2.3.0 for exllamav2==0.0.20, or torch==2.2.0 for exllamav2==0.0.19.
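As a quick sanity check, the pairing above can be expressed as a small script. The `COMPATIBLE` table below covers only the two releases mentioned in this thread; it is not an official compatibility matrix.

```python
# Sketch: check that an installed torch version matches the prebuilt
# exllamav2 wheel, per the pairings mentioned above. Only the two
# releases discussed here are listed; this is not an official table.
COMPATIBLE = {
    "0.0.20": "2.3",
    "0.0.19": "2.2",
}

def torch_matches(exllamav2_version: str, torch_version: str) -> bool:
    """Return True if torch's major.minor matches the wheel's expected ABI."""
    expected = COMPATIBLE.get(exllamav2_version)
    if expected is None:
        return True  # unknown release: no prebuilt pairing to enforce
    # torch version strings look like "2.2.0+cu118"; compare major.minor only
    major_minor = ".".join(torch_version.split("+")[0].split(".")[:2])
    return major_minor == expected

# Example: the cu118 wheel from this thread against torch 2.2.0
print(torch_matches("0.0.19", "2.2.0+cu118"))  # → True
print(torch_matches("0.0.19", "2.3.0"))        # → False
```

In a real environment you would pass `torch.__version__` as the second argument.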
You may want to use the --force-reinstall option when installing PyTorch, since it looks like some dependencies might not get fully resolved otherwise. Also make sure you keep the torchvision and torchaudio packages in sync, even though they aren't used here.
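For the 0.0.19 cu118 wheel above, the reinstall might look like the following. The torchvision/torchaudio pins are the versions that ship alongside torch 2.2.0; double-check them against the PyTorch release notes for your setup.

```shell
# Force-reinstall torch 2.2.0 with matching torchvision/torchaudio
# from the cu118 index, so the prebuilt exllamav2 0.0.19 wheel links
# against the ABI it was built for.
pip install --force-reinstall torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 \
    --index-url https://download.pytorch.org/whl/cu118
```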
Thanks that makes sense. Appreciate the guidance
ImportError: /home/ec2-user/.cache/torch_extensions/py310_cu121/exllamav2_ext/exllamav2_ext.so: undefined symbol: _ZN3c104cuda14ExchangeDeviceEa
Getting this error even though my system CUDA toolkit matches the CUDA version PyTorch was built with; `nvcc -V` and PyTorch's reported CUDA version agree.
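A quick diagnostic sketch for that comparison: extract the toolkit version from the `nvcc -V` banner and compare it to `torch.version.cuda`. Note that a matching CUDA version does not rule out the torch-version ABI mismatch described above, which is what produces these undefined-symbol errors. The sample banner below is illustrative; run `nvcc -V` yourself to get the real one.

```python
import re

def parse_nvcc_release(banner: str) -> str:
    """Pull the 'release X.Y' version out of nvcc's banner text."""
    match = re.search(r"release (\d+\.\d+)", banner)
    if match is None:
        raise ValueError("no release version found in nvcc output")
    return match.group(1)

# Illustrative banner text; the real one comes from `nvcc -V`.
sample = "Cuda compilation tools, release 12.1, V12.1.105"
print(parse_nvcc_release(sample))  # → 12.1
# Compare against torch.version.cuda (e.g. "12.1") in a live environment.
```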