This repository contains the source code for the paper: Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent
We propose a formal description of mathematical problem solving and extend LLMs with an agent-based zero-shot framework named Planner-Reasoner-Executor-Reflector (PRER). We further provide and implement two MathAgents that define the logical forms and inherent relations via a pool of actions at different granularities and orientations: MathAgent-M adapts its actions to LLMs, while MathAgent-H aligns with humankind.
Set your own key in `.env`.
test.ipynb
pip install -r requiresments.txt
cd data
a. miniF2F:
git clone https://github.com/facebookresearch/miniF2F.git
b. MATH
wget https://people.eecs.berkeley.edu/~hendrycks/MATH.tar
tar -xvf MATH.tar
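For the `.env` step above, the following is a minimal sketch of how a file of `KEY=value` lines could be read with only the standard library. The function name `load_env` and the variable name `OPENAI_API_KEY` mentioned below are assumptions for illustration; the repository may expect a different variable name or load the file another way.

```python
from pathlib import Path


def load_env(path=".env"):
    """Parse KEY=value lines from a .env-style file into a dict.

    Blank lines, comment lines starting with '#', and lines without '='
    are skipped. Quotes surrounding a value are removed.
    """
    values = {}
    env_file = Path(path)
    if not env_file.exists():
        # No file yet: return an empty mapping instead of failing.
        return values
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values
```

With a file containing, for example, `OPENAI_API_KEY=...` (a hypothetical name), `load_env()` returns a dict that can be passed to `os.environ.update` before running the code.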
data
└--MATH
    └--test
    └--train
    ...
└--miniF2F
    └--informal
        └--test
        └--valid
    ...
src
main.py
readme.md
requiresments.txt
test.ipynb
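As a quick check that the data landed where the layout above expects it, the sketch below reads every problem file for one split and topic. It assumes the MATH archive unpacks into one JSON file per problem under `data/MATH/<split>/<topic>/`; the helper name `load_math_problems` is hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def load_math_problems(data_dir="data", split="test", topic="algebra"):
    """Read every problem JSON under <data_dir>/MATH/<split>/<topic>/."""
    topic_dir = Path(data_dir) / "MATH" / split / topic
    if not topic_dir.is_dir():
        # Fail early with the path that was expected, to ease debugging.
        raise FileNotFoundError(f"expected directory not found: {topic_dir}")
    problems = []
    for path in sorted(topic_dir.glob("*.json")):
        with path.open(encoding="utf-8") as f:
            problems.append(json.load(f))
    return problems
```

Calling `len(load_math_problems())` from the repository root gives the number of test problems found for the `algebra` topic, and a `FileNotFoundError` points to a misplaced download.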
python main.py \
--dataset "MATH" \
--split "test" \
--topic "algebra"
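The three flags in the command above could be parsed along the following lines. This is only a sketch using the standard `argparse` module; the actual `main.py` may define its arguments differently, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser for the flags shown in the run command."""
    parser = argparse.ArgumentParser(
        description="Sketch of a CLI taking --dataset, --split and --topic"
    )
    # All three are marked required here; that is an assumption, since the
    # real script may provide defaults.
    parser.add_argument("--dataset", required=True, help="dataset name, e.g. MATH")
    parser.add_argument("--split", required=True, help="data split, e.g. test")
    parser.add_argument("--topic", required=True, help="topic, e.g. algebra")
    return parser


# Parse the same values as the command above, passed explicitly as a list.
args = build_parser().parse_args(
    ["--dataset", "MATH", "--split", "test", "--topic", "algebra"]
)
print(args.dataset, args.split, args.topic)
```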
The following table shows the results of our tests on the MATH dataset. More detailed results and analysis can be found in the paper.
| Method | Alg | Prob | Geo | InterAlg | NumTh | PreAlg | Precal | Overall |
|---|---|---|---|---|---|---|---|---|
| WizardMath | 33.3 | 17.3 | 15.7 | 7.1 | 16.3 | 41.7 | 12.6 | 22.7 |
| MAmmoTH | - | - | - | - | - | - | - | 46.8 |
| CR* (k=4) | 79.3 | 57.9 | 39.0 | 28.9 | 54.8 | 71.8 | 30.4 | 54.20 |
| GPT4+CCoT (k=8) | 70.8 | 53.1 | 36.5 | 23.4 | 49.6 | 71.6 | 26.7 | 50.36 |
| GPT4+PHP (k=8) | 74.3 | 56.3 | 41.9 | 26.3 | 55.7 | 73.8 | 29.8 | 53.90 |
| GPT4 | 66.3 | 53.5 | 41.5 | 23.7 | 43.0 | 74.5 | 29.7 | 49.76 |
| MathAgent-M (ours) | 64.3 | 54.6 | 44.1 | 27.2 | 45.4 | 74.4 | 31.5 | 50.88 |
| (vs. GPT4) | -2.0 | +1.9 | +2.6 | +3.5 | +2.4 | -0.1 | +1.8 | +1.12 |
| MathAgent-H (ours) | 76.0 | 62.0 | 47.6 | 31.0 | 59.1 | 83.5 | 36.8 | 59.02 |
| (vs. GPT4) | +9.7 | +8.5 | +6.1 | +7.3 | +16.1 | +9.0 | +7.1 | +9.26 |
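The signed row beneath the MathAgent-H entry corresponds to the per-topic difference from the GPT4 row. The short sketch below recomputes it from the table values; the variable and function names are chosen for illustration only.

```python
TOPICS = ["Alg", "Prob", "Geo", "InterAlg", "NumTh", "PreAlg", "Precal", "overall"]

# Accuracy values copied from the GPT4 and MathAgent-H rows of the table.
gpt4 = [66.3, 53.5, 41.5, 23.7, 43.0, 74.5, 29.7, 49.76]
mathagent_h = [76.0, 62.0, 47.6, 31.0, 59.1, 83.5, 36.8, 59.02]


def gains(method, baseline):
    """Return per-topic differences, rounded to hide floating-point noise."""
    return [round(m - b, 2) for m, b in zip(method, baseline)]


for topic, delta in zip(TOPICS, gains(mathagent_h, gpt4)):
    print(f"{topic}: {delta:+.2f}")
```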
If you find this useful in your research, please cite as:
@article{PRER2023,
title={Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent},
author={Haoran Liao and Qinyi Du and Shaohua Hu and Hao HE and Yanyan Xu and Jidong Tian and Yaohui Jin},
journal={arXiv},
year={2023}
}