SkyworkAI / Skywork

Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model weights, training data, evaluation data, and evaluation methods.

Does it support FlashAttention-2? #23

Closed ericxsun closed 10 months ago

zhao1iang commented 10 months ago

Yes. Our model architecture does not modify the attention module, so it is compatible with FlashAttention-2. However, because flash attention has stricter environment requirements, it is not enabled in the default configuration. If you need it, you can refer to https://huggingface.co/togethercomputer/LLaMA-2-7B-32K/blob/main/modeling_flash_llama.py and adapt flash attention directly by modifying model_skywork.py, or enable it with a patch similar to the one in https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/blob/main/scripts/training/flash_attn_patch.py; a sketch of such a patch is shown below.
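
For reference, a minimal sketch of such a patch, assuming the attention module follows the standard LLaMA layout (`q_proj`/`k_proj`/`v_proj`/`o_proj`, `num_heads`, `head_dim`). The simplified forward signature is an assumption, and rotary embeddings and the KV cache are omitted for brevity; adapt it to the actual classes in model_skywork.py.

```python
# Sketch of a FlashAttention-2 monkey patch for a LLaMA-style attention module.
# Assumes the attention class exposes q_proj/k_proj/v_proj/o_proj, num_heads and head_dim.
import torch
from flash_attn import flash_attn_func  # requires flash-attn >= 2.0


def flash_attention_forward(self, hidden_states, attention_mask=None,
                            position_ids=None, past_key_value=None,
                            output_attentions=False, use_cache=False):
    """Replacement forward pass that calls FlashAttention-2 (no attention weights returned)."""
    bsz, q_len, _ = hidden_states.size()

    # Project to q/k/v and reshape to (batch, seqlen, num_heads, head_dim),
    # which is the layout flash_attn_func expects.
    q = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim)
    k = self.k_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim)
    v = self.v_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim)

    # Rotary position embeddings would normally be applied to q and k here,
    # exactly as in the original forward; omitted in this sketch.

    # Causal flash attention over the full sequence.
    attn_output = flash_attn_func(q, k, v, dropout_p=0.0, causal=True)
    attn_output = attn_output.reshape(bsz, q_len, self.num_heads * self.head_dim)
    return self.o_proj(attn_output), None, past_key_value


def replace_attention_with_flash_attn(attention_cls):
    """Monkey-patch the given attention class to use the flash forward pass."""
    attention_cls.forward = flash_attention_forward
```

Note that flash-attn 2 only supports fp16/bf16 tensors on CUDA, so the model should be loaded in half precision, and the patch should be applied to the attention class before the model is instantiated.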