THUDM / AgentBench

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
https://llmbench.ai
Apache License 2.0
2.15k stars 150 forks

Traces of different evaluations #36

Closed Andrewzh112 closed 11 months ago

Andrewzh112 commented 1 year ago

Is it possible to provide the trajectory traces of the different evaluations?

zhc7 commented 1 year ago

Are you looking for runs.jsonl in the result directory?

Andrewzh112 commented 1 year ago

> Are you looking for runs.jsonl in the result directory?

Yes, where is it located? I couldn't find it. Thanks

zhc7 commented 1 year ago

You can run an evaluation following docs/tutorial.md, and then you'll see the output directory.
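
Once a run has finished, something like the following could be used to locate and read the traces. This is only a minimal sketch: it assumes runs.jsonl is a standard JSON Lines file (one JSON object per line) and makes no assumption about which fields each record contains. The helper names `find_runs_files` and `load_runs` and the `outputs` directory in the usage comment are hypothetical, not part of AgentBench; substitute whatever output directory your run actually created.

```python
import json
from pathlib import Path


def find_runs_files(output_root):
    """Return every runs.jsonl found under output_root, sorted by path.

    output_root is whatever directory your evaluation wrote to
    (an assumption here; check your own run for the real location).
    """
    return sorted(Path(output_root).rglob("runs.jsonl"))


def load_runs(path):
    """Parse a JSON Lines file into a list of Python objects.

    Blank lines are skipped; each remaining line must be valid JSON.
    """
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Example usage (the "outputs" directory name is hypothetical):
# for runs_path in find_runs_files("outputs"):
#     print(runs_path, len(load_runs(runs_path)))
```

Searching recursively avoids hard-coding the layout of the output directory, which may differ between tasks and configurations.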

Xiao9905 commented 11 months ago

@Andrewzh112 Hi, thanks for your interest in AgentBench. Has your problem been solved successfully? Feel free to reopen the issue if you need more help.