Open Anon426 opened 10 months ago
I'm on W10, but thanks anyway; it finally worked once I followed your instructions.
I've tried getting exui to work for the past few days and had nothing but issues; your guide has gotten me the furthest. But when I run the `python server.py` command I get the following:
```
(venv) J:\exui>python server.py
Traceback (most recent call last):
  File "J:\exui\server.py", line 11, in <module>
    from backend.models import update_model, load_models, get_model_info, list_models, remove_model, load_model, unload_model, get_loaded_model
  File "J:\exui\backend\models.py", line 5, in <module>
    from exllamav2 import (
  File "J:\exui\venv\Lib\site-packages\exllamav2\__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "J:\exui\venv\Lib\site-packages\exllamav2\model.py", line 29, in <module>
    from exllamav2.attn import ExLlamaV2Attention, has_flash_attn
  File "J:\exui\venv\Lib\site-packages\exllamav2\attn.py", line 26, in <module>
    import flash_attn
  File "J:\exui\venv\Lib\site-packages\flash_attn\__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "J:\exui\venv\Lib\site-packages\flash_attn\flash_attn_interface.py", line 10, in <module>
    import flash_attn_2_cuda as flash_attn_cuda
ImportError: DLL load failed while importing flash_attn_2_cuda: The specified procedure could not be found.
```
From what I can see I have everything installed; any idea what I'm doing wrong?
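This kind of `DLL load failed` error usually means the installed flash-attn wheel was built against a different torch/CUDA combination than the one actually in the venv. A minimal, stdlib-guarded sketch to report what the venv really has (the helper name is mine; it degrades gracefully if torch is absent):

```python
import importlib.util

def cuda_stack_report() -> str:
    """Return the installed torch/CUDA versions, or a note if torch is missing."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch  # imported lazily so the function also works in torch-less envs
    return f"torch {torch.__version__}, CUDA {torch.version.cuda}"

print(cuda_stack_report())
```

Compare the reported versions against the `+cu…torch…` tags in the flash-attn wheel filename; any mismatch produces exactly this import failure.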
Hi all,
OK, I had multiple issues getting this working with the default requirements.txt install, and although the additional install info is useful, there's no full guide. So here are the steps that will hopefully get it running for you. I'm using Windows 11 (NOT WSL) and a 4090.
1) install Python 3.10.x, with the "add to PATH" option
2) install Git
3) install CUDA 12.1 and make sure the paths are correct
4) install VS Code / C++ build tools (unknown if necessary, but I already have them for many other AI tools)
5) git clone repo and prepare:
git clone https://github.com/turboderp/exui.git
cd exui
python -m venv venv
call venv\Scripts\activate
6) open requirements.txt and delete the torch & exllamav2 entries, then save
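Step 6's manual edit can also be scripted. A small sketch (the function name and delimiter handling are mine, not part of exui) that drops any requirement line whose package name matches torch or exllamav2, whatever version pins the file uses:

```python
import re

def strip_pins(lines, names=("torch", "exllamav2")):
    """Return the requirement lines whose package name is NOT in `names`."""
    keep = []
    for line in lines:
        # Package name = everything before the first version/extras delimiter.
        pkg = re.split(r"[=<>!~\[ ]", line.strip(), maxsplit=1)[0].lower()
        if pkg not in names:
            keep.append(line)
    return keep
```

For example, `strip_pins(["torch==2.1.0", "flask"])` keeps only `"flask"`; write the result back to requirements.txt and the pip install in the next steps won't drag in the wrong torch.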
7) install torch for cuda 12.1:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
7a) then install the remaining requirement deps:
pip install -r requirements.txt
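Before moving on, it's worth confirming step 7 gave you the CUDA build of torch and not a CPU-only wheel (a CPU wheel is another way to end up with the DLL error above). A guarded sketch, with a helper name of my own:

```python
import importlib.util

def torch_is_cuda_build() -> bool:
    """True only if torch is installed AND was built with CUDA support."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # lazy import keeps this runnable in torch-less envs
    return torch.version.cuda is not None

print(torch_is_cuda_build())
```

Inside the activated venv this should print `True`; if it prints `False`, redo step 7 with the cu121 index URL.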
8) download the correct pre-compiled wheels for exllamav2 & flash-attention 2:
https://github.com/turboderp/exllamav2/releases/download/v0.0.11/exllamav2-0.0.11+cu121-cp310-cp310-win_amd64.whl
https://github.com/jllllll/flash-attention/releases/download/v2.4.2/flash_attn-2.4.2+cu121torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl
9) move these wheels to the exui directory, then install them:
pip install exllamav2-0.0.11+cu121-cp310-cp310-win_amd64.whl
pip install flash_attn-2.4.2+cu121torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl
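The wheel filenames encode what they were built for: `cp310` = CPython 3.10, `cu121` = CUDA 12.1, `torch2.1`, `win_amd64`. A tiny sketch (function name is mine) that checks a wheel's CPython tag against the running interpreter, which catches one common cause of the "DLL load failed" error before you even install:

```python
import sys

def wheel_matches_interpreter(wheel_name: str) -> bool:
    """Check that the wheel's cpXY tag matches the running Python version."""
    tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return tag in wheel_name

print(wheel_matches_interpreter(
    "flash_attn-2.4.2+cu121torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl"))
```

On the Python 3.10 venv from step 1 this prints `True`; on any other Python it prints `False`, and you need a different wheel.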
10) install transformers + sentencepiece (sentencepiece may not be needed, but I got it anyway). This is missing from the documentation here entirely, and you'll need these to actually load a model:
pip install --no-cache-dir transformers sentencepiece
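Before launching the server, a stdlib-only sanity check that every piece of the stack is at least locatable in the venv. Note that `find_spec` only locates packages without running their import-time CUDA loading, so this flags missing installs but not DLL mismatches; the function name is mine:

```python
import importlib.util

def missing_modules(names):
    """Return the subset of `names` that cannot be located in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Module names (not pip package names) that exui's backend imports.
print(missing_modules(
    ["torch", "exllamav2", "flash_attn", "transformers", "sentencepiece"]))
```

An empty list means all five are installed; anything printed points at the step you need to redo.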
11) run exui
python server.py
12) go to Models and load your model
enjoy