BlinkDL / LM-Trick-Questions

Here we collect trick questions and failed tasks for open source LLMs to improve them.
Apache License 2.0

Simple math word problems #13

Open Yuzabotov opened 1 year ago

Yuzabotov commented 1 year ago

Summary

RWKV struggles with simple math word problems (ones that require chain-of-thought reasoning).

Training it on these word problems could help RWKV with multistep reasoning, as described in this article by OpenAI: https://openai.com/research/solving-math-word-problems. They also provide their dataset: https://github.com/openai/grade-school-math
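For anyone who wants to turn that dataset into training or evaluation data, here is a minimal sketch of parsing one record. It assumes the published JSON-lines layout, where each line has a `question` and an `answer` field, the answer contains `<<...>>` calculator annotations, and the final result follows a `####` marker. The sample record below is hypothetical (written in that style from the example in this issue, not taken from the dataset), and the helper names are made up for illustration.

```python
import json
import re

# Hypothetical record in the GSM8K style; it is NOT taken from the dataset.
SAMPLE_LINE = json.dumps({
    "question": "Tim grows 5 trees. Each year he collects 6 lemons from each tree. "
                "How many lemons does he get in a decade?",
    "answer": "He collects 5*6=<<5*6=30>>30 lemons per year.\n"
              "So in a decade he gets 30*10=<<30*10=300>>300 lemons.\n"
              "#### 300",
})


def final_answer(answer_text):
    """Return the number after the '####' marker, or None if the marker is missing."""
    match = re.search(r"####\s*(-?[\d,]+(?:\.\d+)?)", answer_text)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))


def strip_calculator_annotations(answer_text):
    """Remove the <<...>> calculator annotations, leaving plain reasoning text."""
    return re.sub(r"<<.*?>>", "", answer_text)


record = json.loads(SAMPLE_LINE)
print(final_answer(record["answer"]))
print(strip_calculator_annotations(record["answer"]))
```

Stripping the annotations keeps the step-by-step reasoning readable as plain text, which is probably closer to what the model should produce in its own output.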

(Note: issue created by Discord user diabolo96)

Model version:

RWKV-4-Raven-7B-v10-Eng99%-Other1%-20230418-ctx8192

Example 1

Instruction: This is a math word problem. You will solve it and provide a simple step by step verification.

Input: Tim grows 5 trees. Each year he collects 6 lemons from each tree. How many lemons does he get in a decade?

Output: Let's start by setting up the problem. In a decade, Tim will collect 6 trees * 10 lemons per tree = 60 lemons from each tree. Therefore, in a decade, he will collect 60 * 10 = 600 lemons. The answer: 600.

Expected Output: Tim collects a total of 5*6 = 30 lemons per year. So in a decade, he gets 30*10 = 300 lemons.
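The expected arithmetic can be checked with a few lines of Python. This is only a sanity check of the two steps (lemons per year, then lemons per decade); the variable names are illustrative.

```python
trees = 5
lemons_per_tree_per_year = 6
years_in_decade = 10

# Step 1: lemons collected across all trees in one year.
lemons_per_year = trees * lemons_per_tree_per_year

# Step 2: scale the yearly total up to ten years.
lemons_per_decade = lemons_per_year * years_in_decade

print(lemons_per_year, lemons_per_decade)  # 30 300
```

By contrast, the model's output computes 6 * 10 = 60 and then multiplies by 10 again, so the tree count of 5 never enters the calculation.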

diabolo98 commented 1 year ago

Update:

It still struggles with simple math word problems even with the new V11 model.

Model version:

RWKV-4-Raven-7B-v11-Eng99%-Other1%-20230427-ctx8192

Example 1

Instruction: (same as before) This is a math word problem. You will solve it and provide a simple step by step verification.

Input: (same as before) Tim grows 5 trees. Each year he collects 6 lemons from each tree. How many lemons does he get in a decade?

Output:

Tested using default settings (correct only once out of 5 tries):

  1. In a decade, Tim will have collected 600 lemons.
  2. In a decade, Tim will have 5 trees x 6 lemons/tree = 30 lemons. In a decade, he will have 30 lemons x 10 years = 300 lemons.
  3. In a decade, Tim will collect 50 lemons from each tree. Therefore, in 10 years he will collect 500 lemons.
  4. In a decade, Tim will collect 600 lemons from the 10 trees. Therefore, in a decade he will collect 60 lemons per tree.
  5. Tim collects 60 lemons in a decade.

Tested using top_p = 0.2, as advised by @BlinkDL (all answers are wrong):

  1. In a decade, Tim will collect a total of 60 lemons from his 5 trees.
  2. In a decade, Tim will collect a total of 5 * 10 = 50 lemons.
  3. In a decade, Tim will collect 60 lemons from each tree.
  4. In a decade, Tim will collect a total of 5 * 10 = 50 lemons.
  5. In a decade, Tim will collect a total of 60 lemons from his 5 trees.

Note: I tried many more times using top_p = 0.2 and it never got it right.
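To make this kind of comparison repeatable, the ten outputs above can be scored automatically. The sketch below is a rough heuristic, not an official evaluation script: it assumes the last number in each output is the model's final answer and compares it with the expected 300. The function names are made up for illustration.

```python
import re

EXPECTED = 300

# Outputs copied from the two runs listed above.
DEFAULT_OUTPUTS = [
    "In a decade, Tim will have collected 600 lemons.",
    "In a decade, Tim will have 5 trees x 6 lemons/tree = 30 lemons. "
    "In a decade, he will have 30 lemons x 10 years = 300 lemons.",
    "In a decade, Tim will collect 50 lemons from each tree. "
    "Therefore, in 10 years he will collect 500 lemons.",
    "In a decade, Tim will collect 600 lemons from the 10 trees. "
    "Therefore, in a decade he will collect 60 lemons per tree.",
    "Tim collects 60 lemons in a decade.",
]
TOP_P_OUTPUTS = [
    "In a decade, Tim will collect a total of 60 lemons from his 5 trees.",
    "In a decade, Tim will collect a total of 5 * 10 = 50 lemons.",
    "In a decade, Tim will collect 60 lemons from each tree.",
    "In a decade, Tim will collect a total of 5 * 10 = 50 lemons.",
    "In a decade, Tim will collect a total of 60 lemons from his 5 trees.",
]


def last_number(text):
    """Crude heuristic (an assumption): treat the last number as the final answer."""
    numbers = re.findall(r"\d+(?:\.\d+)?", text)
    return float(numbers[-1]) if numbers else None


def accuracy(outputs, expected):
    """Return (number of correct outputs, total outputs)."""
    correct = sum(1 for output in outputs if last_number(output) == expected)
    return correct, len(outputs)


print("default settings:", accuracy(DEFAULT_OUTPUTS, EXPECTED))  # (1, 5)
print("top_p = 0.2:", accuracy(TOP_P_OUTPUTS, EXPECTED))         # (0, 5)
```

The last-number rule misreads answers such as "60 lemons from his 5 trees" (it picks 5 instead of 60), but here the verdict is the same either way, since neither value is 300.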