Closed: Ocean-627 closed this issue 4 months ago
Hi! You should follow the Llama 3 chat format:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
{{ L-Eval system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>
{{L-Eval Long context + Question }}<|eot_id|><|start_header_id|>assistant<|end_header_id|> \nAnswer:
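As a minimal, hypothetical sketch, the template above can be rendered in plain Python like this (the special-token strings are the standard Llama 3 ones; in practice `AutoTokenizer.apply_chat_template` from `transformers` builds this for you, and the placeholder arguments below are made up):

```python
# Sketch: render the Llama 3 chat template quoted above by hand.
# In real use you would call tokenizer.apply_chat_template instead.
def build_llama3_prompt(system_prompt: str, user_message: str) -> str:
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\nAnswer:"
    )

# Placeholder arguments, for illustration only.
prompt = build_llama3_prompt("L-Eval system prompt goes here",
                             "Long context ...\nQuestion: ...")
print(prompt.count("<|eot_id|>"))  # 2
```

Note the single assistant header at the end with no closing `<|eot_id|>`: the model is expected to continue from "Answer:".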
If you still cannot reproduce the results, please kindly leave a comment.
Thank you for your kind reply!
@ChenxinAn-fdu I cannot reproduce the Llama3-8B result following your advice; I only got:

{'exact_match': 53.9604, 'num_predicted': 202, 'mean_prediction_length_characters': 1.0, 'LEval_score': 53.9604, 'display_keys': ['exact_match'], 'display': [53.9604]}
Here is my command:

python Baselines/llama2-chat-test.py \
    --metric exam_eval \
    --task_name quality \
    --max_length 4k
and I changed llama2-chat-test.py as follows:

elif args.metric == "exam_eval":
    context = "Document is as follows. {document} \nQuestion: {inst}. Please directly give the answer without any additional output or explanation "

message = "<|begin_of_text|>" + sys_prompt
message += "\nAnswer:"
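For what it's worth, the `context` string above is a plain Python format string; filled in, it produces the user turn. A small sketch (the `document` and `inst` values here are placeholders, not L-Eval data):

```python
# Fill the script's `context` format string (copied from the change
# above) with placeholder values to see the resulting user turn.
context = ("Document is as follows. {document} \nQuestion: {inst}. "
           "Please directly give the answer without any additional "
           "output or explanation ")

user_turn = context.format(document="<long document text>",
                           inst="Which option is correct?")
print(user_turn.startswith("Document is as follows."))  # True
```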
Excellent work! I noticed that the README provides results for Llama3-8b. However, I used meta-llama/Meta-Llama-3-8B-Instruct with llama2-chat-test.py and replaced LlamaTokenizer with AutoTokenizer, but I couldn't reproduce the results shown in the table. Could you please provide the reproduction code and commands for achieving the results on Llama3-8b? Thank you very much!