OpenMOSS / CoLLiE

Collaborative Training of Large Language Models in an Efficient Way
https://openlmlab-collie.readthedocs.io
Apache License 2.0

flash_attn error when fine-tuning moss 7B with collie on a 3090 #76

Closed: ShacklesLay closed this issue 1 year ago

ShacklesLay commented 1 year ago

RuntimeError: FlashAttention backward for head dim > 64 requires A100 or H100 GPUs as the implementation needs a large amount of shared memory.

KaiLv69 commented 1 year ago

Hi, moss 7B's head dim is 128, which flash-attn does not support on a 3090. You can set config.use_flash=False.
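For reference, disabling flash attention is a one-line config change. A minimal sketch, assuming CoLLiE's `CollieConfig.from_pretrained` follows the Hugging Face-style API; the checkpoint name below is a placeholder, substitute whatever MOSS 7B checkpoint you are actually loading:

```python
# Minimal sketch: fall back to the standard attention implementation on GPUs
# where flash-attn's backward pass for head dim > 64 is unsupported (e.g. a 3090).
from collie import CollieConfig

# "fnlp/moss-base-7b" is a placeholder path, not confirmed by this thread.
config = CollieConfig.from_pretrained("fnlp/moss-base-7b")
config.use_flash = False  # per the maintainer's suggestion above

# ... pass `config` to the model / trainer as usual
```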

ZiboZ commented 1 year ago

Hi, doesn't FlashAttention choose its block size based on the size of the L1 cache (shared memory)? In theory the V100 should be able to support it too, right?