deepseek-ai / DeepSeek-Math

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
MIT License

Something is wrong with flash-attn in my environment; can I drop it when fine-tuning DeepSeek-Math? #28

Open AceCHQ opened 3 months ago

AceCHQ commented 3 months ago

Hello, flash-attn is not working in my environment. Can I drop it when I fine-tune DeepSeek-Math? Will doing so hurt the model's performance? Thank you.
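For context, here is a minimal sketch of what dropping flash-attn might look like. It assumes the model is loaded through Hugging Face `transformers` (a version that accepts the `attn_implementation` argument) and that the checkpoint id is `deepseek-ai/deepseek-math-7b-base`. Neither is stated in this issue, and the helper names `flash_attn_usable`, `pick_attn_implementation`, and `load_model` are hypothetical. FlashAttention computes exact attention, so switching to PyTorch's built-in SDPA kernel should mainly affect speed and memory use, not the results beyond floating-point noise.

```python
def flash_attn_usable():
    """Return True only if the flash_attn package imports cleanly."""
    try:
        import flash_attn  # noqa: F401  (a broken install raises here)
    except Exception:
        return False
    return True


def pick_attn_implementation(use_flash=None):
    """Choose an attention backend name understood by transformers.

    use_flash=None means "auto-detect"; pass True/False to force a choice.
    """
    if use_flash is None:
        use_flash = flash_attn_usable()
    return "flash_attention_2" if use_flash else "sdpa"


def load_model(model_id="deepseek-ai/deepseek-math-7b-base"):
    """Load the model with whichever backend is available.

    The model id is an assumption; substitute the checkpoint you fine-tune.
    """
    # Imported lazily so the helpers above work without these packages.
    import torch
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        attn_implementation=pick_attn_implementation(),
    )
```

Calling `load_model()` on a machine where flash-attn fails to import would fall back to `"sdpa"`; `"eager"` is another value `transformers` accepts if SDPA is unavailable in the installed PyTorch.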