Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent

This repository contains the source code for the article: Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent.

Introduction

We propose a formal description of mathematical problem solving and extend LLMs with an agent-based zero-shot framework named Planner-Reasoner-Executor-Reflector (PRER). We further provide and implement two MathAgents that define the logical forms and inherent relations via a pool of actions at different granularities and orientations: MathAgent-M adapts its actions to LLMs, while MathAgent-H aligns with humankind.

Quick start

Replace the value of OPENAI_API_KEY in the .env file with your own key.
Run test.ipynb.
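
As a quick sanity check before opening the notebook, the sketch below reads a .env file and reports whether OPENAI_API_KEY is set. It uses only the standard library. The helper names `load_env_file` and `api_key_is_set` are hypothetical and not part of this repo, and the repo itself may load the key differently.

```python
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict.

    Hypothetical helper for illustration; not part of this repo.
    """
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def api_key_is_set(values):
    """Return True when OPENAI_API_KEY is present and non-empty."""
    return bool(values.get("OPENAI_API_KEY"))


if __name__ == "__main__":
    if api_key_is_set(load_env_file(".env")):
        print("OPENAI_API_KEY found in .env")
    else:
        print("OPENAI_API_KEY is missing or empty in .env")
```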

Test on MATH or miniF2F

Clone this repo and install the dependencies:
```
pip install -r requiresments.txt
```

Download miniF2F (informal) and MATH for later testing:

cd data

a. miniF2F:

git clone https://github.com/facebookresearch/miniF2F.git

b. MATH

wget https://people.eecs.berkeley.edu/~hendrycks/MATH.tar
tar -xvf MATH.tar

Before testing, confirm that your directory structure looks like this:

data
└--MATH
   └--test
   └--train
   ...
└--miniF2F
   └--informal
      └--test
      └--valid
   ...
src
main.py
readme.md
requiresments.txt
test.ipynb
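
The layout above can also be checked programmatically. The following is a minimal sketch, assuming it is run from the repository root. The `missing_dirs` helper and the `EXPECTED_DIRS` list are hypothetical names introduced here, and the list covers only the directories shown explicitly in the tree.

```python
from pathlib import Path

# Directories named explicitly in the layout above, relative to the repo root.
EXPECTED_DIRS = [
    "data/MATH/test",
    "data/MATH/train",
    "data/miniF2F/informal/test",
    "data/miniF2F/informal/valid",
]


def missing_dirs(root="."):
    """Return the expected directories that do not exist under root."""
    base = Path(root)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:")
        for d in missing:
            print("  " + d)
    else:
        print("Directory structure looks complete.")
```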

Run MathAgent on these two datasets and store the results:

python main.py \
  --dataset "MATH" \
  --split "test" \
  --topic "algebra"
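
The command above covers a single topic. To sweep all MATH subjects, a small driver along the following lines may help. This is a sketch under assumptions: the topic names are the subject folder names used by the MATH dataset, and it is assumed (not confirmed by this README) that main.py accepts each of them as a `--topic` value. `build_command` and `run_all` are hypothetical helpers, and the default is a dry run that only prints the commands.

```python
import subprocess
import sys

# Subject folder names in the MATH dataset. Assumption: main.py accepts
# each of these as a --topic value, as it does for "algebra".
MATH_TOPICS = [
    "algebra",
    "counting_and_probability",
    "geometry",
    "intermediate_algebra",
    "number_theory",
    "prealgebra",
    "precalculus",
]


def build_command(dataset, split, topic):
    """Build the argument list for one main.py run."""
    return [
        sys.executable, "main.py",
        "--dataset", dataset,
        "--split", split,
        "--topic", topic,
    ]


def run_all(topics=MATH_TOPICS, dry_run=True):
    """Print (dry run) or execute one main.py run per topic."""
    commands = [build_command("MATH", "test", t) for t in topics]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sweep if a run exits with an error.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` executes the runs one after another; each run calls the OpenAI API, so it is worth confirming the printed commands first.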

Results

The following table shows the results of our tests on the MATH dataset. More detailed results and analysis can be found in the paper.

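| Method | Alg | Prob | Geo | InterAlg | NumTh | PreAlg | Precal | Overall |
|---|---|---|---|---|---|---|---|---|
| WizardMath | 33.3 | 17.3 | 15.7 | 7.1 | 16.3 | 41.7 | 12.6 | 22.7 |
| MAmmoTH | - | - | - | - | - | - | - | 46.8 |
| CR*(k=4) | 79.3 | 57.9 | 39.0 | 28.9 | 54.8 | 71.8 | 30.4 | 54.20 |
| GPT4+CCoT(k=8) | 70.8 | 53.1 | 36.5 | 23.4 | 49.6 | 71.6 | 26.7 | 50.36 |
| GPT4+PHP(k=8) | 74.3 | 56.3 | 41.9 | 26.3 | 55.7 | 73.8 | 29.8 | 53.90 |
| GPT4 | 66.3 | 53.5 | 41.5 | 23.7 | 43.0 | 74.5 | 29.7 | 49.76 |
| MathAgent-M (ours) | 64.3 | 54.6 | 44.1 | 27.2 | 45.4 | 74.4 | 31.5 | 50.88 |
| Δ vs. GPT4 | -2.0 | +1.9 | +2.6 | +3.5 | +2.4 | -0.1 | +1.8 | +1.12 |
| MathAgent-H (ours) | 76.0 | 62.0 | 47.6 | 31.0 | 59.1 | 83.5 | 36.8 | 59.02 |
| Δ vs. GPT4 | +9.7 | +8.5 | +6.1 | +7.3 | +16.1 | +9.0 | +7.1 | +9.26 |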

Citation

If you find this work useful in your research, please cite it as:

@article{PRER2023,
  title={Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent},
  author={Haoran Liao and Qinyi Du and Shaohua Hu and Hao HE and Yanyan Xu and Jidong Tian and Yaohui Jin},
  journal={Arxiv},
  year={2023}
}
