oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

CUDA 11.8 installation fails because of missing flash_attention release #5654

bigsatchmo closed this issue 6 months ago

bigsatchmo commented 6 months ago

Describe the bug

When installing on Windows with option NVIDIA (A) and choosing CUDA 11.8 Support (Y), the installation fails with: ERROR: HTTP error 404 while getting https://github.com/oobabooga/flash-attention/releases/download/v2.5.6/flash_attn-2.5.6+cu118torch2.2.0cxx11abiFALSE-cp311-cp311-win_amd64.whl

As far as I can see, there is no longer an existing release for CUDA 11.8: https://github.com/oobabooga/flash-attention/releases
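For context, the failing URL follows the standard wheel naming scheme, which encodes the flash-attn version, CUDA version, torch version, ABI flag, Python tag, and platform. The sketch below rebuilds the URL from those components so it is clear which part (the `cu118` build) is missing from the release page. This is an illustrative reconstruction, not the actual installer code; the function name and parameters are hypothetical.

```python
# Hypothetical sketch of how the wheel URL from the log is composed.
# Version numbers default to the ones in the failing install; changing
# cuda to "121" would point at the CUDA 12.1 build instead.
def flash_attn_wheel_url(flash_ver="2.5.6", cuda="118", torch="2.2.0",
                         py_tag="cp311", platform="win_amd64"):
    wheel = (f"flash_attn-{flash_ver}+cu{cuda}torch{torch}"
             f"cxx11abiFALSE-{py_tag}-{py_tag}-{platform}.whl")
    return (f"https://github.com/oobabooga/flash-attention/releases/"
            f"download/v{flash_ver}/{wheel}")

print(flash_attn_wheel_url())
```

If the asset for a given combination was never uploaded to the release, pip's download step returns exactly the HTTP 404 shown in the logs.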

Is there an existing issue for this?

Reproduction

After cloning main on a Windows system, choose A for NVIDIA and Y for CUDA 11.8 support during installation.


Logs

ERROR: HTTP error 404 while getting https://github.com/oobabooga/flash-attention/releases/download/v2.5.6/flash_attn-2.5.6+cu118torch2.2.0cxx11abiFALSE-cp311-cp311-win_amd64.whl

System Info

System: Windows 10 Pro
GPU: TESLA K80
CPU: AMD Ryzen 5 5500
RAM: 32GB
oobabooga commented 6 months ago

Fixed in https://github.com/oobabooga/text-generation-webui/commit/bef08129bce8a969582a0132c282ab1eb47cfaa4