deepseek-ai / DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
MIT License

How the preference data was constructed #31

Closed. pandaupc closed this issue 4 months ago

pandaupc commented 4 months ago

The paper states: "We obtain code preference data based on compiler-feedback, and mathematical preference data based on the ground-truth labels." Could you explain in detail how this is done?

luofuli commented 4 months ago

We currently have no plans to disclose information beyond what is in the technical report. @pandaupc