LZY-the-boys opened this issue 1 month ago (status: Open)
This is the result I obtained after cloning and reproducing (using seed 0), and everything seems normal. Could you provide more details on how you reproduced the results to help identify your issue?
Evaluate seed 0 results:
Test samples: 1319
Final Accuracy: 54.13191811978771
I want to make sure that `./models/llama-2/llama-2-7b` is equal to `meta-llama/Llama-2-7b-hf`, that `load_dataset("./data/gsm8k/main", split="test")` is equal to `load_dataset("openai/gsm8k", 'main')['test']`, and that `load_dataset("./data/MetaMathQA", split="train")` is equal to `load_dataset("meta-math/MetaMathQA", split="train")`, because these are the only changes I made to commit hash dd49ab67b0fe7f539969536a39abe8f4b14536bc.
We discovered a few bugs in the data processing code, which have now been fixed. We're using an earlier version of the meta-llama/Llama-2-7b-hf model (hash: 637a748546bb9abca62b0684183cc362bc1ece6d). You can download it via transformers.LlamaForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", revision="637a748546bb9abca62b0684183cc362bc1ece6d"). We'll release an updated version of the code soon!
Any updates?
Hi, thanks for your awesome work. I am very interested in it, but I am running into problems reproducing the GSM8K result. I kept the GitHub code unchanged, ran the original shell script, and got:
It seems the accuracy is only 15.16; however, the result reported in the paper is 54.23 to 56.48. I don't know what is wrong.
The training script:
uses 8 × A100 GPUs; the model is `meta-llama/Llama-2-7b-hf`, and the data is loaded by
`load_dataset("meta-math/MetaMathQA", split="train")`
The test script:
uses the seed-0 trained model; the test data is loaded by
`dataset = load_dataset("openai/gsm8k", 'main')['test']`
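When the reported accuracy collapses from ~54 to ~15, the answer-extraction step of the evaluation is a common culprit, so it may be worth checking before retraining. Below is a minimal sketch of GSM8K-style scoring, assuming the dataset's standard `#### <answer>` gold-label convention; the helper names `extract_gsm8k_answer` and `accuracy` are hypothetical and not from this repo:

```python
import re

def extract_gsm8k_answer(text: str):
    """Pull the final numeric answer that follows '#### ' (GSM8K's gold-label format).

    Returns the number as a string with thousands separators stripped,
    or None if no '#### <number>' pattern is present.
    """
    m = re.search(r"####\s*([-+]?[\d,]*\.?\d+)", text)
    return m.group(1).replace(",", "") if m else None

def accuracy(pred_texts, gold_texts) -> float:
    """Percentage of predictions whose extracted answer matches the gold answer."""
    correct = sum(
        (a := extract_gsm8k_answer(p)) is not None and a == extract_gsm8k_answer(g)
        for p, g in zip(pred_texts, gold_texts)
    )
    return 100.0 * correct / len(gold_texts)
```

Note that model generations often do not emit the `####` marker, so real eval scripts usually fall back to taking the last number in the generation; if that fallback is broken or the prompt format changed, accuracy can drop sharply even with a well-trained model.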