nezhazheng opened 1 year ago
Does the 7B model not use FlashAttention? I looked through the 7B code and couldn't find any logic for it.
No. We use xformers for training, and a naive implementation for inference.
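For reference, the "naive implementation" of attention at inference time generally means materializing the full attention score matrix instead of using a fused kernel like FlashAttention or xformers' `memory_efficient_attention`. The repo's actual code is not shown here, so the following is only a minimal NumPy sketch of that idea; the function name and shapes are illustrative, not taken from the project.

```python
import numpy as np

def naive_attention(q, k, v):
    """Naive causal scaled dot-product attention (illustrative sketch).

    q, k, v: arrays of shape (seq_len, head_dim).
    The full (seq_len, seq_len) score matrix is materialized, which is
    exactly what FlashAttention/xformers kernels avoid.
    """
    scale = 1.0 / np.sqrt(q.shape[-1])
    scores = q @ k.T * scale
    # Causal mask: position i may only attend to positions <= i.
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Fused kernels compute the same result but tile the softmax so the full score matrix never resides in memory, which is why they matter mainly for long-sequence training.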