oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

DLL load failed while importing flash_attn_2_cuda: The specified module could not be found. #4344

Closed: XeonG closed this 5 months ago

XeonG commented 1 year ago

Describe the bug

Loading an AWQ model fails with `ImportError: DLL load failed while importing flash_attn_2_cuda: The specified module could not be found.` The full console output and traceback are in the Logs section below.

Is there an existing issue for this?

Reproduction

Start the Windows batch file (start_windows.bat) and load an AWQ model.

Screenshot

No response

Logs

bin C:\Projects\AI\text-generation-webui\installer_files\env\lib\site-packages\bitsandbytes\libbitsandbytes_cuda118.dll
2023-10-21 18:49:25 INFO:Loading the extension "gallery"...
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
C:\Projects\AI\text-generation-webui\installer_files\env\lib\site-packages\gradio\components\dropdown.py:231: UserWarning: The value passed into gr.Dropdown() is not in the list of choices. Please update the list of choices to include: llama or set allow_custom_value=True.
  warnings.warn(
2023-10-21 18:49:58 INFO:Loading vicuna-33B-coder-AWQ...
2023-10-21 18:49:58 ERROR:Failed to load the model.
Traceback (most recent call last):
  File "C:\Projects\AI\text-generation-webui\modules\ui_model_menu.py", line 201, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
  File "C:\Projects\AI\text-generation-webui\modules\models.py", line 79, in load_model
    output = load_func_map[loader](model_name)
  File "C:\Projects\AI\text-generation-webui\modules\models.py", line 291, in AutoAWQ_loader
    from awq import AutoAWQForCausalLM
  File "C:\Projects\AI\text-generation-webui\installer_files\env\lib\site-packages\awq\__init__.py", line 2, in <module>
    from awq.models.auto import AutoAWQForCausalLM
  File "C:\Projects\AI\text-generation-webui\installer_files\env\lib\site-packages\awq\models\__init__.py", line 1, in <module>
    from .mpt import MptAWQForCausalLM
  File "C:\Projects\AI\text-generation-webui\installer_files\env\lib\site-packages\awq\models\mpt.py", line 1, in <module>
    from .base import BaseAWQForCausalLM
  File "C:\Projects\AI\text-generation-webui\installer_files\env\lib\site-packages\awq\models\base.py", line 11, in <module>
    from awq.quantize.quantizer import AwqQuantizer
  File "C:\Projects\AI\text-generation-webui\installer_files\env\lib\site-packages\awq\quantize\quantizer.py", line 10, in <module>
    from awq.quantize.scale import apply_scale, apply_clip
  File "C:\Projects\AI\text-generation-webui\installer_files\env\lib\site-packages\awq\quantize\scale.py", line 7, in <module>
    from transformers.models.llama.modeling_llama import LlamaRMSNorm
  File "C:\Projects\AI\text-generation-webui\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 45, in <module>
    from flash_attn import flash_attn_func, flash_attn_varlen_func
  File "C:\Projects\AI\text-generation-webui\installer_files\env\lib\site-packages\flash_attn\__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "C:\Projects\AI\text-generation-webui\installer_files\env\lib\site-packages\flash_attn\flash_attn_interface.py", line 8, in <module>
    import flash_attn_2_cuda as flash_attn_cuda
ImportError: DLL load failed while importing flash_attn_2_cuda: The specified module could not be found.

System Info

GPU: RTX 4090
OS: Windows 10
RAM: 80 GB
LucianoSP commented 1 year ago

Same problem here. RTX 4090, Windows 11, 64 GB RAM.

chielteuben commented 1 year ago

Duplicate of #4342

Starlento commented 1 year ago

I have the issue as well. It is recommended to create a new conda environment. You can see in the README and requirements.txt that the requirements are now cu121. After strictly following the environment-creation steps, I can run it successfully.
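
For reference, a minimal sketch of that environment creation, along the lines of what the README described at the time (the environment name is illustrative, and this assumes an NVIDIA GPU):

conda create -n textgen python=3.11
conda activate textgen
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt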

ayush1268 commented 12 months ago

I have the issue as well. It is recommended to create a new conda environment. You can see in the README and requirements.txt that the requirements are now cu121. After strictly following the environment-creation steps, I can run it successfully.

How do I create a new conda environment? Also, I can't load any model; it gives me a runtime error along with this same error. How can I solve it? Can you look at my error too?

https://github.com/oobabooga/text-generation-webui/issues/4357#issue-1955941282

LTSarc commented 12 months ago

So, I made a billion different attempts, including manually switching through different CUDA branches, and nothing worked.

AutoAWQ is just broken for now. It can't be run on any clean install, and I can't find the exact commit that broke it.
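
For what it's worth, git bisect is the usual way to hunt down an exact breaking commit (a sketch; you still need a known-good commit to start from):

git bisect start
git bisect bad HEAD
git bisect good <known-good-commit>
# test each checkout the bisect gives you, then mark it:
git bisect good   # or: git bisect bad
git bisect reset  # when done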

nalf3in commented 11 months ago

Can be fixed temporarily by rolling requirements.txt back and rebuilding:

git checkout 8f6405d2fa1c704edbcd2f4371ac21c3491d162b -- requirements.txt
docker compose build --no-cache  # if using docker
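
Outside of docker, the equivalent after the checkout would presumably be to reinstall the rolled-back pins inside the webui's environment (a sketch):

pip install --force-reinstall -r requirements.txt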

user177013 commented 11 months ago

Not only has the CUDA requirement been bumped to 12.1; Python also needs to be updated to 3.11. So here's a solution that doesn't require a fresh install or a new conda environment.

First you need to download and install CUDA 12.1, then move its entry above your old CUDA in the PATH.
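
To confirm the reordering took effect, something like this in a new command prompt should list the 12.1 toolkit first:

where nvcc
echo %CUDA_PATH%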

Then in command line:

# navigate to your installation path
cd <where you installed the repo>
# activate the conda environment and verify the installed CUDA version
conda activate textgen
nvcc --version

The correct version looks like this:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Wed_Feb__8_05:53:42_Coordinated_Universal_Time_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0

After verifying the CUDA version, upgrade your Python to 3.11.

# uninstall the old version to avoid conflicts
conda uninstall -y python
# install python 3.11
conda install -y python=3.11

Update your PyTorch:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Then force-reinstall the requirements to avoid conflicts:

pip install --force-reinstall -r requirements.txt

Then start server.py. It might also be useful to set these values in your batch file, but I'm not really sure; only use this if the steps above still fail.

set "CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1"
set "CUDA_HOME=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1"
Tybost commented 11 months ago

Having the same issue. AutoAWQ is broken.

LTSarc commented 11 months ago

Not only has the CUDA requirement been bumped to 12.1; Python also needs to be updated to 3.11. So here's a solution that doesn't require a fresh install or a new conda environment.

So the thing is, I did a complete reinstall inside WSL2 (lost a file in the process, RIP) on 12.1 and 3.11, among many other attempts to fix the problem (on Linux, at least, 12.1/3.11 are currently the defaults anyhow), and it didn't work. It's just borked.

stupidcreature commented 11 months ago

For me an easy local workaround was to just copy cudart64_12.dll (which I had in some other installation folder) to a place where it can be found. I placed it in ${WORKSPACE}/installer_files/env/bin/, but I assume you can also put it anywhere in your search path.
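
On Windows, that copy might look like this (a sketch; the source path assumes a default CUDA 12.1 toolkit install, and the destination matches the install path from the logs above):

copy "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin\cudart64_12.dll" "C:\Projects\AI\text-generation-webui\installer_files\env\bin"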

XeonG commented 11 months ago

Aah, weeks later... updated text-gen and copied that CUDA file (I had previously installed CUDA 12), and it all works. It didn't until I copied that file on Windows. Well, so much for the one-click installer... at least AWQ is working now, thanks.

user177013 commented 11 months ago

Aah, weeks later... updated text-gen and copied that CUDA file (I had previously installed CUDA 12), and it all works. It didn't until I copied that file on Windows. Well, so much for the one-click installer... at least AWQ is working now, thanks.

There's a reason why manual installation is recommended... The last time I used the one-click installer, several months ago, it broke my PATH. If you can, always install manually. It's really not that hard to do.

mongolu commented 11 months ago

Even better than this is to use the one-click installer in docker. I'm using it like this right now, and it self-updates every time I start the container.

So 🥂 to the devs 🥂, 10x

XeonG commented 11 months ago

I mean, I don't really trust anything built on the shit show that is Python to consistently work properly at all.

Why anyone would build libraries for it, in it, and around it is a mystery to me.

Though you would think a one-click installer would be maintained by people with the patience to deal with the broken mess that is Python and all the crap built around it. The fact that we have to copy a DLL from a CUDA install is crazy; perhaps it's just the lack of support for Windows, I dunno.

There's certainly nothing mentioned on the main page about the one-click installer requiring a manual CUDA install and copying a DLL for it to work properly.

odragora commented 10 months ago

Same problem.

odragora commented 10 months ago

For me it was easy to do a local workaround to just copy cudart64_12.dll (that I had in some other installation folder) to a place it can be found. I placed in in ${WORSPACE}/installer_files/env/bin/ but you can also put it anywhere in your search path I assume.

Doesn't work for me, unfortunately.

azulika commented 10 months ago

same

baichuanzhou commented 9 months ago

same here

Rrezakk commented 8 months ago

same...

reserved-word commented 8 months ago

The tips here didn't help me. I deleted the installer_files directory, rolled back to the latest snapshot (2024-02-04), and did a clean install; now it works.
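
That rollback might look something like this (a sketch; it assumes the repo's dated snapshot tags/branches, a Windows shell, and that deleting installer_files makes the start script rebuild the environment):

git fetch origin
git checkout snapshot-2024-02-04
rmdir /s /q installer_files
start_windows.bat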

machntosh commented 8 months ago

Hello, guys! What is the step-by-step guide to fix this issue: DLL load failed while importing flash_attn_2_cuda? 🤔

mintnmint1 commented 8 months ago

Uninstalling flash-attn helped me. Run the command below:

pip3 uninstall flash-attn
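
A quick way to confirm the AWQ import chain now loads cleanly afterwards (run inside the webui's own environment; this assumes nothing else in your setup strictly requires flash-attn):

python -c "from awq import AutoAWQForCausalLM; print('ok')"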

ApiaoSamaa commented 7 months ago

Simply uninstall and reinstall. The instructions are below; I fixed the same issue on my Linux server.

Before I hit the problem, I had run the following in bash:

conda activate MY_ENV
pip install packaging
pip install ninja  # to speed up building flash_attn
pip install flash_attn

Everything went well until I reinstalled PyTorch+CUDA. And then I saw the error: DLL load failed while importing flash_attn_2_cuda: The specified module could not be found.

What I did is just the following. (Make sure you have activated the right conda environment and installed packaging and ninja.)

pip uninstall flash_attn
pip install flash_attn
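
This works because flash_attn's compiled extension (the flash_attn_2_cuda module from the error) is built against a specific PyTorch/CUDA pairing; reinstalling PyTorch leaves a stale binary behind, and reinstalling flash_attn rebuilds it against the current stack. A quick sanity check afterwards (a sketch):

python -c "import torch, flash_attn; print(torch.__version__, torch.version.cuda)"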
github-actions[bot] commented 5 months ago

This issue has been closed due to inactivity for 2 months. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.