turboderp / exui

Web UI for ExLlamaV2
MIT License

Windows 11 install procedure and missing dependencies (transformers) #31

Anon426 commented 10 months ago

Hi all,

OK, I had multiple issues getting this working with the default requirements.txt install, and although the additional install info is useful, there's no full guide. So here are the steps that will hopefully get it running for you. I'm using Windows 11 (NOT WSL) and a 4090.

1) Install Python 3.10.x with the "Add to PATH" option
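A quick sanity check that the right interpreter ended up on PATH - this should print 3.10.x:

python --version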

2) Install Git

3) Install CUDA 12.1 and make sure the paths are correct
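To confirm the toolkit is the one actually on PATH - both of these should point at 12.1:

nvcc --version

where nvcc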

4) Install VS Code for C++ (unsure if it's necessary, but I already have it for many other AI gens)

5) Git clone the repo and prepare a venv:

git clone https://github.com/turboderp/exui.git

cd exui

python -m venv venv

call venv\Scripts\activate
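If activation worked, the venv's interpreter should be listed first here:

where python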

6) Open requirements.txt, delete the torch & exllamav2 entries, and save

7) Install torch for CUDA 12.1:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
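Worth verifying that torch can see the GPU before going further - this should print a +cu121 build and True:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"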

7a) Then install the remaining requirements:

pip install -r requirements.txt

8) Download the correct pre-compiled wheels for exllamav2 & flash-attention 2. The tags in the filenames need to match your setup: cp310 = Python 3.10, cu121 = CUDA 12.1, and torch2.1 = the torch build installed in step 7.

https://github.com/turboderp/exllamav2/releases/download/v0.0.11/exllamav2-0.0.11+cu121-cp310-cp310-win_amd64.whl

https://github.com/jllllll/flash-attention/releases/download/v2.4.2/flash_attn-2.4.2+cu121torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl

9) Move these wheels to the exui directory, then install them:

pip install "exllamav2-0.0.11+cu121-cp310-cp310-win_amd64.whl

pip install "flash_attn-2.4.2+cu121torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl"

10) Install transformers + sentencepiece (sentencepiece may not be needed, but I got it anyway). This is missing from the documentation here, and you'll need these to actually load a model:

pip install --no-cache-dir transformers sentencepiece
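Same idea for this step - it should print the installed transformers version without errors:

python -c "import transformers, sentencepiece; print(transformers.__version__)"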

11) Run exui:

python server.py

12) Go to Models and load your model.

Enjoy!

Odin7094 commented 9 months ago

I'm on W10, but thanks anyway; it finally worked once I followed your instructions.

tamanor commented 6 months ago

I've been trying to get exui to work for the past few days and have had nothing but issues; your guide has got me the furthest. But when I run the python server.py command I get the following:

(venv) J:\exui>python server.py
Traceback (most recent call last):
  File "J:\exui\server.py", line 11, in <module>
    from backend.models import update_model, load_models, get_model_info, list_models, remove_model, load_model, unload_model, get_loaded_model
  File "J:\exui\backend\models.py", line 5, in <module>
    from exllamav2 import(
  File "J:\exui\venv\Lib\site-packages\exllamav2\__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "J:\exui\venv\Lib\site-packages\exllamav2\model.py", line 29, in <module>
    from exllamav2.attn import ExLlamaV2Attention, has_flash_attn
  File "J:\exui\venv\Lib\site-packages\exllamav2\attn.py", line 26, in <module>
    import flash_attn
  File "J:\exui\venv\Lib\site-packages\flash_attn\__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "J:\exui\venv\Lib\site-packages\flash_attn\flash_attn_interface.py", line 10, in <module>
    import flash_attn_2_cuda as flash_attn_cuda
ImportError: DLL load failed while importing flash_attn_2_cuda: The specified procedure could not be found.

From what I can see I have everything installed. Any idea what I'm doing wrong?
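One thing worth checking (a guess, not confirmed in this thread): that flash_attn wheel is compiled against torch 2.1 - the torch2.1cxx11abiFALSE tag in its filename - so if the venv ended up with a different torch build, the compiled module can fail to load exactly like this. Comparing the two is quick:

python -c "import torch; print(torch.__version__)"

pip show flash_attn

If the torch version doesn't start with 2.1 (or isn't a +cu121 build), reinstalling the matching torch from step 7, or picking a flash_attn wheel built for the torch you actually have, should line them up.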