hkust-nlp / dart-math

[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
https://hkust-nlp.github.io/dart-math/
MIT License

Changing batch_size from 64 to 32: reproduced results are not in line with the paper #4

Open xiaosongyuan opened 1 month ago

xiaosongyuan commented 1 month ago

I ran train-single-node.sh from the scripts directory to SFT the LLaMA-3-8B base model on the provided dart-math-hard and dart-math-uniform data, with only one modification: I changed batch_size from 64 to 32 (due to CUDA memory limits). The test accuracy of these two SFT models on GSM8K is 0.4185 (hard) and 0.4617 (uniform).
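For context (an assumption on my side, not something confirmed in this report): if the per-device batch has to shrink to fit memory, one common way to keep the original effective batch size of 64 is to raise gradient accumulation instead. A minimal sketch using the standard Hugging Face `TrainingArguments` fields (the hyperparameter values below are placeholders, and these flag names may not map one-to-one onto train-single-node.sh):

```python
from transformers import TrainingArguments

# Hypothetical setup for illustration: 8 GPUs, per-device batch of 4.
# Effective batch = per_device_train_batch_size * n_gpus * gradient_accumulation_steps
#                 = 4 * 8 * 2 = 64, matching the original global batch size,
# whereas dropping accumulation to 1 would halve it to 32.
args = TrainingArguments(
    output_dir="out",                # placeholder path
    per_device_train_batch_size=4,   # reduced to fit CUDA memory
    gradient_accumulation_steps=2,   # raised to compensate for the smaller per-device batch
)

n_gpus = 8  # assumed device count for this sketch
print("effective batch size:",
      args.per_device_train_batch_size * n_gpus * args.gradient_accumulation_steps)
```

Whether the accuracy gap comes from the halved effective batch size or from something else in my setup is exactly what I am trying to figure out.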

The primary packages in my environment are: torch 2.0.1, transformers 4.42.