baichuan-inc / Baichuan-7B

A large-scale 7B pretraining language model developed by BaiChuan-Inc.
https://huggingface.co/baichuan-inc/baichuan-7B
Apache License 2.0

[Question] Doesn't the 7B model use FlashAttention? #130


nezhazheng commented 1 year ago

Questions

Doesn't the 7B model use FlashAttention? I looked through the 7B code and didn't find any logic for it.


mmmans commented 1 year ago

No. We use xformers for training, and a naive implementation for inference.
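To make the distinction concrete, here is a minimal sketch of what "naive" attention means: it materializes the full (seq_len x seq_len) score matrix, which is exactly the quadratic memory cost that xformers' `memory_efficient_attention` (and FlashAttention) avoid by computing the softmax in tiles. The function name `naive_attention` and the NumPy formulation are illustrative assumptions, not the repo's actual inference code.

```python
import numpy as np

def naive_attention(q, k, v):
    """Naive scaled dot-product attention (illustrative sketch).

    q, k, v: arrays of shape (batch, seq_len, head_dim).
    Materializes the full (seq_len x seq_len) attention matrix,
    which tiled kernels like xformers/FlashAttention avoid.
    """
    d = q.shape[-1]
    # Full score matrix: (batch, seq_len, seq_len) -- the memory bottleneck.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    # Numerically stable softmax over the last axis.
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

For training, a call like `xformers.ops.memory_efficient_attention(q, k, v)` (with tensors instead of NumPy arrays) computes the same result without ever storing the full score matrix, which is what makes long-sequence pretraining feasible.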