Open lpyhdzx opened 1 year ago
Hi,
The results in the paper have not been updated yet; those are our first-version results, from models trained with math_data.json. The math_data.json file is our first version of the training data: a mixture of multiple math datasets with rationales derived from the logs of Zero-Shot CoT. In that first version, we used 80% of the original test samples as the training set and the remaining 20% as the test set.
Moreover, we have collected and released the updated math_10k.json, which is drawn from the training sets of GSM8K, MAWPS, MAWPS-Single, and AQuA. The test sets of all datasets are the same as the originals. Thus, the results in the table are all from models trained with math_10k.json and evaluated on the original test sets, e.g., the 1319 samples of GSM8K.
Thus, please use the results on GitHub for now; we will update the paper soon. Please let me know if you have further questions!
Hi! Thanks for the great work! I'm very curious about how you collected the new training data math_10k.json. Did you use ChatGPT? Could you provide more information on this?
Hi, the new training data math_10k.json was collected with ChatGPT. If you need it, we can upload the data collection code.
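A hypothetical sketch of how such Zero-Shot CoT rationale collection with ChatGPT could look. The "Let's think step by step" trigger is the standard Zero-Shot CoT phrase; the model name, message layout, and helper names are assumptions for illustration, not the authors' actual collection code.

```python
def build_cot_prompt(question):
    # Append the Zero-Shot CoT trigger phrase to elicit a step-by-step rationale.
    return f"{question}\nLet's think step by step."

def collect_rationale(client, question, model="gpt-3.5-turbo"):
    """Query ChatGPT for a rationale to one question.

    `client` is assumed to be an OpenAI client, e.g.
    `from openai import OpenAI; client = OpenAI()`.
    """
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": build_cot_prompt(question)}],
    )
    return resp.choices[0].message.content
```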
That would be great :)
Very good work! Looking at the paper, I noticed more inconsistencies between the results in Table 1 and the fine-tuning results on GitHub; for example, LLaMA-7B scores 21.9 on GSM8K in the paper, but the result in the link is 30.9. Is this due to the inconsistent settings?
Please forgive me if I have misunderstood.