deepseek-ai / DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
MIT License
3.59k stars 150 forks

How to fine-tune deepseek v2 models? #40

Open satheeshkatipomu opened 6 months ago

satheeshkatipomu commented 6 months ago

Hi,

Can you please give us instructions for fine-tuning the DeepSeek-V2 models? Can we use the finetune.py script from DeepSeek-MoE? https://github.com/deepseek-ai/DeepSeek-MoE/blob/main/finetune/finetune.py

luofuli commented 5 months ago

No plans...

yiyepiaoling0715 commented 4 months ago

No plans...

Why? We need this.

TMelikhov commented 3 months ago

We are also very interested to do this. So far our experiments have been unsuccessful. It would be incredible if we could get some clues.

yiyepiaoling0715 commented 3 months ago

We are also very interested to do this. So far our experiments have been unsuccessful. It would be incredible if we could get some clues.

What is the problem? I have used repos such as MFTCoder and Firefly to train it successfully, except for evaluation during training, and the loss decreases normally. Would these training pipelines be OK?

yiyepiaoling0715 commented 3 months ago

We are also very interested to do this. So far our experiments have been unsuccessful. It would be incredible if we could get some clues.

I have recorded the bugs I ran into while adapting DeepSeek-V2 to these training repos. Could we talk over WeChat about these cases? My ID: yiyepiaoling0715

yiyepiaoling0715 commented 3 months ago

like this image