OpenLMLab / LEval

[ACL'24 Oral] Data and code for L-Eval, a comprehensive long context language models evaluation benchmark
GNU General Public License v3.0

topic_retrieval_longchat task: does eval change the predictions to True or False? #7

Closed: DavideHe closed this issue 9 months ago

DavideHe commented 10 months ago

I cloned this project with git, then ran:

python Evaluation/auto_eval.py --pred_file Predictions/exam_eval/turbo-16k-0613/topic_retrieval_longchat.pred.jsonl

This is the result I get. It shows that the predictions have been changed to "True" or "False"?

False The effects of air pollution on human health | score=0
.........................
====================
False The role of education in society | score=0
====================
There are 0 correct answers
 [for coursera:] 0 can not select all correct options
 Total: 150 questions.
{'exact_match': 0.0, 'num_predicted': 150, 'mean_prediction_length_characters': 5.0, 'LEval_score': 0.0, 'display_keys': ['exact_match'], 'display': [0.0]}
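
The numbers in this log are consistent with every prediction having been replaced by the five-character string "False": the mean prediction length is exactly 5.0 and no answer matches its topic. The sketch below reproduces that arithmetic under stated assumptions. It is not the code in Evaluation/auto_eval.py; the helper names `exact_match` and `mean_length` and the placeholder reference topic are hypothetical and only illustrate how such metrics could come out this way.

```python
# Hypothetical reconstruction of the symptom, not the project's evaluation code.
# Assumption: all 150 predictions were overwritten with the string "False".
predictions = ["False"] * 150

# Placeholder reference answers; the real file contains 150 topic strings.
references = ["The role of education in society"] * 150


def exact_match(preds, refs):
    """Fraction of predictions equal to their reference, ignoring case and outer spaces."""
    hits = sum(p.strip().lower() == r.strip().lower() for p, r in zip(preds, refs))
    return hits / len(preds)


def mean_length(preds):
    """Average prediction length in characters."""
    return sum(len(p) for p in preds) / len(preds)


summary = {
    "exact_match": exact_match(predictions, references),
    "num_predicted": len(predictions),
    "mean_prediction_length_characters": mean_length(predictions),
}
print(summary)
```

A mean length of exactly 5.0 over 150 items is what one would expect if each prediction were "False", which suggests the predictions were overwritten before scoring rather than scored incorrectly.
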

ChenxinAn-fdu commented 10 months ago

Sorry!!! I will check this immediately!!!

ChenxinAn-fdu commented 10 months ago

Fixed! Thank you for reminding me of the bug 😃!