LZY-the-boys opened this issue 1 month ago (status: Open)
This is the result I obtained after cloning and reproducing (using seed 0), and everything seems normal. Could you provide more details on how you reproduced the results to help identify your issue?
Evaluate seed 0 results:
Test samples: 1319
Final Accuracy: 54.13191811978771
I want to make sure that `./models/llama-2/llama-2-7b` is equal to `meta-llama/Llama-2-7b-hf`, that `load_dataset("./data/gsm8k/main", split="test")` is equal to `load_dataset("openai/gsm8k", 'main')['test']`, and that `load_dataset("./data/MetaMathQA", split="train")` is equal to `load_dataset("meta-math/MetaMathQA", split="train")`, because these are the only changes I made to commit hash dd49ab67b0fe7f539969536a39abe8f4b14536bc.
We discovered a few bugs in the data processing code, which have now been fixed. We're using an earlier version of the meta-llama/Llama-2-7b-hf model (hash: 637a748546bb9abca62b0684183cc362bc1ece6d). You can download it via transformers.LlamaForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", revision="637a748546bb9abca62b0684183cc362bc1ece6d"). We'll release an updated version of the code soon!
Any updates?
Hi, thanks for your awesome work. I am very interested in it, but I am running into problems reproducing the GSM8K result. I kept the GitHub code unchanged, ran the original shell script, and got:
It seems the accuracy is only 15.16; however, the result reported in the paper is 54.23 to 56.48. I don't know what is wrong.
The training script:
uses 8 × A100 GPUs; the model is `meta-llama/Llama-2-7b-hf`, and the data is loaded by
`load_dataset("meta-math/MetaMathQA", split="train")`
The test script:
uses the seed-0 trained model; the test data is loaded by
`dataset = load_dataset("openai/gsm8k", 'main')['test']`
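When the reported accuracy collapses from ~54 to ~15, the answer-extraction step of the evaluation is a common culprit, so it may be worth checking before retraining. Below is a minimal sketch of GSM8K-style scoring, assuming the dataset's standard `#### <answer>` gold-label convention; the helper names `extract_gsm8k_answer` and `accuracy` are hypothetical and not from this repo:

```python
import re

def extract_gsm8k_answer(text: str):
    """Pull the final numeric answer that follows '#### ' (GSM8K's gold-label format).

    Returns the number as a string with thousands separators stripped,
    or None if no '#### <number>' pattern is present.
    """
    m = re.search(r"####\s*([-+]?[\d,]*\.?\d+)", text)
    return m.group(1).replace(",", "") if m else None

def accuracy(pred_texts, gold_texts) -> float:
    """Percentage of predictions whose extracted answer matches the gold answer."""
    correct = sum(
        (a := extract_gsm8k_answer(p)) is not None and a == extract_gsm8k_answer(g)
        for p, g in zip(pred_texts, gold_texts)
    )
    return 100.0 * correct / len(gold_texts)
```

Note that model generations often do not emit the `####` marker, so real eval scripts usually fall back to taking the last number in the generation; if that fallback is broken or the prompt format changed, accuracy can drop sharply even with a well-trained model.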