Hi. Thank you very much for the excellent paper. I was impressed throughout by the strong performance and the detailed descriptions.
I have one question about a point I did not understand while reading.
In Section 3.1 (SFT Data Curation), the paragraph on English mathematical datasets says:
problems are paired with solutions in chain-of-thought (CoT) (Wei et al., 2022), program-of-thought (PoT) (Chen et al., 2022; Gao et al., 2023), and tool-integrated reasoning format (Gou et al., 2023)
The text says the problems are "paired with" solutions in these three formats, but how exactly are they paired? Is it like case 1 below, where all three formats are written in a single answer? Or is it like case 2 below, where each question appears three times, once with each format? If possible, could you show me an example file for one question?
case 1)
Question : what is 1+1 ?
Answer : CoT + PoT + Tool-integrated reasoning format
ex) 1 + 1 = 2 / print(1+1) / ...
case 2)
Question : what is 1+1 ?
Answer : CoT
ex) 1 + 1 = 2
Question : (same question as above)
Answer : PoT
ex) print(1+1)
Question : (same question as above)
Answer : Tool-integrated reasoning format
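To make the two cases concrete, here is a minimal sketch of what I mean. The field names (`question`, `answer`) and the sample solution strings are my own assumptions for illustration only; they are not taken from the paper or its data.

```python
import json

question = "What is 1+1?"

# Hypothetical solutions in each format (made up for this question, not real data).
cot = "1 + 1 = 2, so the answer is 2."
pot = "print(1 + 1)"
tir = "Let me compute this with code: print(1 + 1) gives 2, so the answer is 2."

# Case 1: a single record whose answer contains all three formats together.
case1 = [{"question": question, "answer": "\n".join([cot, pot, tir])}]

# Case 2: three records for the same question, one per solution format.
case2 = [{"question": question, "answer": a} for a in (cot, pot, tir)]

print(json.dumps(case1, indent=2))
print(json.dumps(case2, indent=2))
```

In case 1 the dataset has one sample per question; in case 2 it has three samples per question.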
Thank you again for the great paper.