THUDM / GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs
Apache License 2.0

Flash Attention 2 installation error #235

Closed: washgo closed this issue 6 days ago

washgo commented 6 days ago

System Info / 系統信息

NVIDIA-SMI 535.161.08, Driver Version: 535.161.08, CUDA Version: 12.2, Python 3.12.4

Who can help? / 谁可以帮助到您?

No response

Information / 问题信息

Reproduction / 复现过程

 raise RuntimeError(
  RuntimeError: FlashAttention is only supported on CUDA 11.6 and above.  Note: make sure nvcc has a supported version by running nvcc -V.

  torch.__version__  = 2.3.1+cu121

  [end of output]

Expected behavior / 期待表现

Flash Attention 2 was added to the code yesterday and now it won't run. I installed https://github.com/Dao-AILab/flash-attention but got the error above, even though the environment already shows CUDA 12.

zRzRzRzRzRzRzR commented 6 days ago

"FlashAttention is only supported on CUDA 11.6 and above": you need to check your nvcc version, not the torch version. For questions about installing FlashAttention, please ask in the FlashAttention repository.
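
As a quick way to run the check the maintainer suggests, here is a minimal sketch (not part of the thread; it assumes nvcc is expected to be on PATH). It compares the CUDA toolkit version that nvcc reports with the CUDA version this PyTorch build ships with; the 11.6 minimum comes from the error message above.

# Illustrative sketch: flash-attn compiles with nvcc, so the nvcc-reported
# CUDA toolkit version is the one that must be >= 11.6, not the version
# shown by nvidia-smi or implied by the torch wheel tag.
import subprocess

import torch

print("torch build CUDA:", torch.version.cuda)  # "12.1" for 2.3.1+cu121

try:
    result = subprocess.run(["nvcc", "-V"], capture_output=True, text=True, check=True)
    release_line = next(
        (line for line in result.stdout.splitlines() if "release" in line),
        result.stdout.strip(),
    )
    print("nvcc reports    :", release_line.strip())
except FileNotFoundError:
    # A driver-only setup is a common cause of this error: nvidia-smi can
    # show "CUDA Version: 12.2" while no CUDA toolkit (and hence no nvcc)
    # is installed at all.
    print("nvcc not found on PATH: install the CUDA toolkit or add its bin/ to PATH")

If nvcc is missing or reports an older toolkit than the driver, installing a CUDA 12.x toolkit (or pointing PATH at an existing one) should let the flash-attention build pass its version check.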