Closed: Vuizur closed this issue 3 months ago
It looks like Colab now ships with flash-attn preinstalled, but it's the version compiled for Torch 2.1.0, and it doesn't get updated when requirements.txt installs torch>=2.2.0. I've updated the notebook so it should work again.
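For reference, a minimal sketch of what such a fix can look like: upgrade torch first, then force flash-attn to be reinstalled against it. The exact commands and pins here are my assumption, not necessarily what the updated notebook does:

```python
# Hypothetical Colab cell (an assumption, not the notebook's exact fix):
# install the required torch first, then force flash-attn to be
# rebuilt/reinstalled against it, since the preinstalled wheel
# targets Torch 2.1.0.
!pip install -q "torch>=2.2.0"
!pip install -q flash-attn --no-build-isolation --force-reinstall
```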
When executing chat_example.ipynb in Google Colab (https://colab.research.google.com/github/turboderp/exllamav2/blob/master/examples/chat_example.ipynb), I get errors. I run the (optional) flash attention cell and then the cell that installs the exllama requirements. Pip warns about the following:
(I think this might be caused by an incompatible torch version pulled in alongside flash-attn, but I'm not sure.)
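A quick way to check for that kind of mismatch is a cell like this (a hypothetical diagnostic, not part of the notebook): a flash-attn wheel built against a different torch typically fails to import with an undefined-symbol ImportError.

```python
# Compare the running torch version with whether flash-attn can import.
import torch
print("torch:", torch.__version__)

try:
    import flash_attn
    print("flash-attn:", flash_attn.__version__)
except ImportError as e:
    # A wheel compiled for another torch version usually fails here.
    print("flash-attn does not match the installed torch:", e)
```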
When executing the last cell, it fails with:
This might be caused by the previous errors.
(Google Colab assigned me a T4, not the V100 (?) selected by default.)
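(The assigned GPU can be confirmed with a one-liner like the following; just a convenience check, not something the notebook requires:)

```python
# Print the GPU Colab assigned to this runtime, e.g. "Tesla T4".
import torch
print(torch.cuda.get_device_name(0))
```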
Thanks a lot for maintaining this project!