deepseek-ai / DeepSeek-Coder

DeepSeek Coder: Let the Code Write Itself
https://coder.deepseek.com/
MIT License

Are NTP and FIM 2 separate stages of training, or are they combined? #146

Closed: Calvinnncy97 closed this issue 6 months ago

Calvinnncy97 commented 6 months ago

After reading through the issues, I understand that repo-level concatenation is done for NTP (next-token prediction). My question: are NTP and FIM (fill-in-the-middle) two separate stages of training, or are they combined?

Thank you.

pkuzqh commented 6 months ago

Combined.

Calvinnncy97 commented 6 months ago

If that's the case, is FIM segmentation also applied to repo-concatenated documents?

pkuzqh commented 6 months ago

Yes.

Calvinnncy97 commented 6 months ago

Thank you!
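
The combined setup confirmed above can be sketched as follows. This is only an illustration, not the DeepSeek-Coder training code: the sentinel strings, the `fim_rate` default, the character-level split, the file-header comment format, and all function names are assumptions made for the example. The idea is that files from one repository are concatenated first, and each resulting document is then either left as-is (a plain NTP sample) or rearranged into prefix-suffix-middle order (a FIM sample) within the same training run.

```python
import random

# Hypothetical sentinel strings; a real tokenizer defines its own special tokens.
FIM_PREFIX = "<FIM_PREFIX>"
FIM_SUFFIX = "<FIM_SUFFIX>"
FIM_MIDDLE = "<FIM_MIDDLE>"


def concat_repo(files):
    """Join (path, source) pairs from one repository into a single document."""
    return "\n".join(f"# {path}\n{source}" for path, source in files)


def maybe_fim(doc, rng, fim_rate=0.5):
    """With probability fim_rate, rearrange doc into prefix-suffix-middle order.

    Otherwise return doc unchanged, so it serves as a plain next-token sample.
    """
    if len(doc) < 2 or rng.random() >= fim_rate:
        return doc
    # Pick two cut points that split the document into prefix, middle, suffix.
    lo, hi = sorted(rng.sample(range(len(doc) + 1), 2))
    prefix, middle, suffix = doc[:lo], doc[lo:hi], doc[hi:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"


def build_samples(repos, seed=0, fim_rate=0.5):
    """Concatenate each repo, then apply FIM to a random subset of the documents."""
    rng = random.Random(seed)
    return [maybe_fim(concat_repo(files), rng, fim_rate) for files in repos]
```

Under these assumptions, NTP and FIM samples end up in one mixed list, and a FIM sample can span file boundaries because the split happens after concatenation. A real pipeline would more likely split at the token level and pack samples to a fixed context length.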