Dao-AILab / flash-attention

Fast and memory-efficient exact attention
BSD 3-Clause "New" or "Revised" License

ModuleNotFoundError: No module named 'dropout_layer_norm' #615

Open Lvjinhong opened 1 year ago

Lvjinhong commented 1 year ago

I've tried installing flash-attn with pip install flash-attn==2.2.1 and with flash-attn==2.3; in both cases the installation completes successfully. However, when I attempt distributed training with Megatron-LM, I consistently encounter the following issue:

(screenshot of the error: ModuleNotFoundError: No module named 'dropout_layer_norm')

Additionally, when I tried building from the source code, the issue persisted.

I am using a V100, CUDA 12.1, PyTorch 2.1, and Python 3.10.
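The failure can be reproduced without launching Megatron-LM at all: the fused layer norm ships as a separate extension, so a bare import shows whether it was actually built. A minimal check, assuming only the module name taken from the traceback:

```python
# Minimal reproduction: the base `pip install flash-attn` does not build the
# optional fused layer-norm extension, so this import fails with the same
# ModuleNotFoundError that Megatron-LM hits at startup.
try:
    import dropout_layer_norm  # CUDA extension built from csrc/layer_norm
    print("dropout_layer_norm found at:", dropout_layer_norm.__file__)
except ModuleNotFoundError as e:
    print("extension missing:", e)
```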

davidbuterez commented 1 year ago

I think you need to install it separately, as noted in https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm:

cd csrc/layer_norm && pip install .
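After that build finishes, a quick sanity check is to import both the compiled extension and the Python wrapper around it. This is a sketch; the wrapper path flash_attn.ops.layer_norm is assumed from the repo layout, so adjust it if your installed version differs:

```python
# Quick post-install check: both the compiled extension and the wrapper
# module should now import cleanly.
import dropout_layer_norm  # the CUDA extension built from csrc/layer_norm
import flash_attn.ops.layer_norm as fused_ln  # wrapper path assumed from the repo layout

print("extension:", dropout_layer_norm.__file__)
print("wrapper exposes:", [n for n in dir(fused_ln) if "layer_norm" in n])
```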
Husamx commented 1 year ago

Did you manage to get this to work? I read that FlashAttention doesn't support the V100: https://github.com/Dao-AILab/flash-attention/issues/524
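Before spending more time on the build, it may be worth confirming the GPU generation. The linked issue reports that the FlashAttention 2 kernels target Ampere (sm80) and newer, while a V100 is compute capability 7.0. A small check using only PyTorch:

```python
import torch

# Report the compute capability of each visible GPU. FlashAttention 2's
# kernels target Ampere (sm80) and newer; a V100 reports (7, 0), which is
# why the linked issue says it is unsupported.
for i in range(torch.cuda.device_count()):
    major, minor = torch.cuda.get_device_capability(i)
    name = torch.cuda.get_device_name(i)
    supported = (major, minor) >= (8, 0)
    print(f"GPU {i}: {name} sm{major}{minor} -> FlashAttention 2 supported: {supported}")
```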