huaxiaohua opened this issue 2 months ago
Hello, I would like to run Meta-Llama-3.1-70B-Instruct on the MATH test set. How should I set the system prompt and decoding hyperparameters? Should I use few-shot or zero-shot prompting?

Hi! Thanks for your question! We used our internal eval implementation to generate those metrics rather than the public lm_evaluation_harness library. Here is a summary of our eval details; we have also published the full evaluation results as datasets in the Llama 3.1 Evals Hugging Face collection for you to review.