LLMs Are Not Intelligent Thinkers: Introducing Mathematical Topic Tree Benchmark for Comprehensive Evaluation of LLMs
https://arxiv.org/abs/2406.05194
The dataset is in Data.jsonl in the Data folder. Each question has the following fields: "Path", "Question", "Choices", "Correct Answer". For example, the first question is:
{"Path": ["Mathematics", "Pure Mathematics", "Algebra", "Abstract Algebra", "Group Theory", "Group", "Definitions and group axioms"], "Question": "G= set of all integers, a.b=a-b, H=set of all positive integers, a.b = ab, where ab is the usual product of integers. Which one of G and H form a group?", "Choices": ["Only G", "Only H", "Both of G and H.", "None of them."], "Correct Answer": "D"}
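A minimal sketch of reading the file in Python, in case it helps. It assumes the file sits at Data/Data.jsonl relative to the working directory, that each line holds one JSON object (the usual JSON Lines layout), and that the "Correct Answer" letter indexes into "Choices" with "A" as the first entry, which is consistent with the example above. The helper names load_questions, correct_choice, and topic_path are hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def load_questions(path):
    """Read one JSON object per line, skipping blank lines."""
    questions = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                questions.append(json.loads(line))
    return questions


def correct_choice(question):
    """Return the text of the correct choice.

    Assumption: the answer letter maps to a position in "Choices",
    with "A" as index 0, "B" as index 1, and so on.
    """
    index = ord(question["Correct Answer"].strip().upper()) - ord("A")
    return question["Choices"][index]


def topic_path(question):
    """Join the topic tree path into a single readable string."""
    return " > ".join(question["Path"])


if __name__ == "__main__":
    data_file = Path("Data") / "Data.jsonl"  # assumed location
    if data_file.exists():
        questions = load_questions(data_file)
        first = questions[0]
        print(topic_path(first))
        print(first["Question"])
        print(correct_choice(first))
```

For the example question above, correct_choice would return "None of them." under the letter-to-index assumption.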