GSM8K few-shot sampling from test set?

amazon-science / auto-cot

Official implementation for "Automatic Chain of Thought Prompting in Large Language Models" (stay tuned & more will be updated)

https://arxiv.org/abs/2210.03493

Apache License 2.0


GSM8K few-shot sampling from test set? #4

Open GasolSun36 opened 1 year ago

GasolSun36 commented 1 year ago

Excellent work! In Appendix D (page 19) of your paper, you show automatically constructed demonstrations for GSM8K. However, I find that these 8 cases come from test.jsonl rather than train.jsonl. Is there a data leakage problem?

cooelf commented 11 months ago

Hi, there should be no data leakage problem, because no gold labels are used. We only collect the questions for automatic rationale generation.
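
To make the reply above concrete, here is a minimal sketch of what "collecting only questions" could look like. It assumes GSM8K-style JSONL records with `question` and `answer` fields. The function names `load_questions_only` and `build_zero_shot_cot_prompt` are hypothetical and not taken from the auto-cot repository, the two records are invented toy examples rather than real GSM8K items, and the prompt template is an assumption based on the usual zero-shot CoT trigger phrase.

```python
import json


def load_questions_only(jsonl_lines):
    """Parse GSM8K-style JSONL records and keep only the question text.

    The gold "answer" field is never read, so no label reaches the prompt.
    """
    questions = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        questions.append(record["question"])
    return questions


def build_zero_shot_cot_prompt(question):
    """Wrap a question in a zero-shot CoT prompt (assumed template)."""
    return f"Q: {question}\nA: Let's think step by step."


# Invented toy records standing in for lines of test.jsonl.
sample_lines = [
    json.dumps({
        "question": "Tom has 3 apples and buys 2 more. How many apples does he have?",
        "answer": "3 + 2 = 5\n#### 5",
    }),
    json.dumps({
        "question": "A box holds 4 pens. How many pens are in 6 boxes?",
        "answer": "4 * 6 = 24\n#### 24",
    }),
]

questions = load_questions_only(sample_lines)
prompts = [build_zero_shot_cot_prompt(q) for q in questions]

for prompt in prompts:
    print(prompt)
    print("---")
```

In this sketch the rationale for each demonstration would come from the model's own completion of the prompt, not from the dataset's `answer` field. Whether sampling demonstration questions from the test split is acceptable is still the point under discussion in this thread; the code only illustrates that the labels themselves go unused.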