alibaba / Pai-Megatron-Patch

The official repo of Pai-Megatron-Patch for large-scale LLM & VLM training, developed by Alibaba Cloud.
Apache License 2.0

Support for Flash-Attn 3 #308

Open echo-valor opened 3 months ago

echo-valor commented 3 months ago

Sorry to bother you. When will PAI-Megatron support Flash-attn 3?

jerryli1981 commented 2 months ago

Hello. In the upcoming Mcore-based llama3.1 release, we will test Flash Attention 3 and try to support it.