open-compass / T-Eval

[ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step
https://open-compass.github.io/T-Eval/
Apache License 2.0
235 stars 15 forks

Incomplete test results #52

Open Mrak6192 opened 6 months ago

Mrak6192 commented 6 months ago

The test results were all saved, but the -1.json file produced by the test contains only two sections. How can I fix this? [image]

zehuichen123 commented 6 months ago

Hmm, how about running the evaluation code yourself?

Mrak6192 commented 6 months ago

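Found the problem: the two scoring models, mpnet and gte, had not been downloaded because the connection to hf (Hugging Face) failed at the time. I reran the evaluation and now get results. Thanks!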
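
For anyone who runs into the same symptom, here is a minimal sketch of a check that the scoring models are present in the local Hugging Face cache before the evaluation is run. It uses only the standard library and relies on the hub's usual cache layout (`models--<org>--<name>/snapshots/<revision>`). The repo IDs in `ASSUMED_SCORER_REPOS` are guesses at what "mpnet" and "gte" refer to, and `is_model_cached` and `missing_models` are hypothetical helper names, not part of T-Eval.

```python
import os
from pathlib import Path
from typing import List, Optional

# Assumption: these repo IDs are illustrative guesses for "mpnet" and "gte".
# Replace them with whatever your evaluation config actually loads.
ASSUMED_SCORER_REPOS = [
    "sentence-transformers/all-mpnet-base-v2",
    "thenlper/gte-large",
]


def hub_cache_dir() -> Path:
    """Return the Hugging Face hub cache directory, honoring HF_HUB_CACHE and HF_HOME."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    hf_home = os.environ.get("HF_HOME", str(Path.home() / ".cache" / "huggingface"))
    return Path(hf_home) / "hub"


def is_model_cached(repo_id: str, cache_dir: Optional[Path] = None) -> bool:
    """Return True if at least one snapshot of repo_id exists in the local cache."""
    root = cache_dir if cache_dir is not None else hub_cache_dir()
    snapshots = root / ("models--" + repo_id.replace("/", "--")) / "snapshots"
    return snapshots.is_dir() and any(snapshots.iterdir())


def missing_models(repo_ids: List[str], cache_dir: Optional[Path] = None) -> List[str]:
    """Return the repo IDs that have no cached snapshot, in input order."""
    return [r for r in repo_ids if not is_model_cached(r, cache_dir)]
```

Calling `missing_models(ASSUMED_SCORER_REPOS)` before the evaluation step lists any scorer that still needs to be downloaded. A snapshot directory can exist while being incomplete, so this only catches the case where the download never started.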