deepseek-ai / DeepSeek-Math

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
MIT License

[Question] SFT Data Curation #10

Closed choco9966 closed 6 months ago

choco9966 commented 6 months ago

Hi. Thank you very much for the excellent paper. I was impressed throughout by its strong performance and detailed descriptions.

I have one question about something I didn't understand while reading. Section 3.1 (SFT Data Curation) says the following about the English mathematical datasets:

problems are paired with solutions in chain-of-thought (CoT) (Wei et al., 2022), program-of-thought (PoT) (Chen et al., 2022; Gao et al., 2023), and tool-integrated reasoning format (Gou et al., 2023)

The text says the problems are paired with solutions in these three formats. Does that mean case 1 below, where all three formats are written in a single answer? Or case 2 below, where each question appears three times, once with each format as its answer? If possible, could you show me an example file for one question?

Case 1)
Question: What is 1+1?
Answer: CoT + PoT + tool-integrated reasoning format
ex) 1 + 1 = 2 / print(1+1) / ...

Case 2)
Question: What is 1+1?
Answer: CoT
ex) 1 + 1 = 2

Question: What is 1+1?
Answer: PoT
ex) print(1+1)

Question: What is 1+1?
Answer: Tool-integrated reasoning format

Thank you again for the great paper.

ZhihongShao commented 6 months ago

Thanks for your question! It's more like case 2 but definitely not case 1.
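
For readers who want something concrete, here is a minimal sketch of what a case 2 layout could look like: one record per (question, format) pair, so the same question appears three times. This is only an illustration. The field names (`question`, `format`, `answer`), the helper `build_case2_records`, and the tool-integrated answer text are all hypothetical. The thread does not show the actual DeepSeekMath SFT file format.

```python
import json

# Hypothetical schema: the real DeepSeekMath SFT data format is not shown in this thread.
def build_case2_records(question, solutions):
    """Return one record per (question, format) pair, as in case 2."""
    return [
        {"question": question, "format": fmt, "answer": answer}
        for fmt, answer in solutions.items()
    ]

# Illustrative answers only; the tool-integrated text is a made-up example
# that interleaves natural-language reasoning with a code call.
solutions = {
    "cot": "1 + 1 = 2",
    "pot": "print(1 + 1)",
    "tool_integrated": (
        "Add the two numbers with code:\n"
        "print(1 + 1)\n"
        "The output is 2, so the answer is 2."
    ),
}

records = build_case2_records("What is 1 + 1?", solutions)

# Write the records as JSON Lines: three separate training examples.
for record in records:
    print(json.dumps(record))
```

Case 1 would instead be a single record whose answer concatenates all three formats, which the reply above rules out.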