THUDM / GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs
Apache License 2.0

Flash Attention 2 installation error #235

Closed: washgo closed this issue 6 days ago

washgo commented 6 days ago

System Info / 系統信息

NVIDIA-SMI 535.161.08, Driver Version: 535.161.08, CUDA Version: 12.2, Python 3.12.4

Who can help? / 谁可以帮助到您?

No response

Information / 问题信息

Reproduction / 复现过程

 raise RuntimeError(
  RuntimeError: FlashAttention is only supported on CUDA 11.6 and above.  Note: make sure nvcc has a supported version by running nvcc -V.

  torch.__version__  = 2.3.1+cu121

  [end of output]

Expected behavior / 期待表现

Flash Attention 2 was added to the code yesterday and now it won't run. I installed https://github.com/Dao-AILab/flash-attention but got the error above, even though the environment already shows CUDA 12.

zRzRzRzRzRzRzR commented 6 days ago

"FlashAttention is only supported on CUDA 11.6 and above": you need to check your nvcc version, not the torch version. For questions about installing FlashAttention, please ask in the FlashAttention repository.
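
As a quick way to run the check the maintainer suggests, here is a minimal sketch (not part of the thread; it assumes nvcc is expected to be on PATH). It compares the CUDA toolkit version that nvcc reports with the CUDA version this PyTorch build ships with; the 11.6 minimum comes from the error message above.

# Illustrative sketch: flash-attn compiles with nvcc, so the nvcc-reported
# CUDA toolkit version is the one that must be >= 11.6, not the version
# shown by nvidia-smi or implied by the torch wheel tag.
import subprocess

import torch

print("torch build CUDA:", torch.version.cuda)  # "12.1" for 2.3.1+cu121

try:
    result = subprocess.run(["nvcc", "-V"], capture_output=True, text=True, check=True)
    release_line = next(
        (line for line in result.stdout.splitlines() if "release" in line),
        result.stdout.strip(),
    )
    print("nvcc reports    :", release_line.strip())
except FileNotFoundError:
    # A driver-only setup is a common cause of this error: nvidia-smi can
    # show "CUDA Version: 12.2" while no CUDA toolkit (and hence no nvcc)
    # is installed at all.
    print("nvcc not found on PATH: install the CUDA toolkit or add its bin/ to PATH")

If nvcc is missing or reports an older toolkit than the driver, installing a CUDA 12.x toolkit (or pointing PATH at an existing one) should let the flash-attention build pass its version check.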