Thanks so much for sharing the findings and insights about "Multi-Choice Question Benchmarks". I have a quick question about the 20 million Chinese MC data that led to overfitting without generalizing to other tasks: is the data composed of questions with bare options only, or do the answers also include some kind of explanation?
Thank you again for your great work!