THUDM / AgentBench

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
https://llmbench.ai
Apache License 2.0
2.15k stars, 150 forks

CardGame task always running #19

Closed: cicyby closed this issue 11 months ago

cicyby commented 1 year ago

[image] The verification process is stuck here ...

Is there a problem with Docker communication?

foamliu commented 1 year ago

I have encountered the same problem. Do we need to configure the environment first?

Suggestions:

  1. Document how to configure the environment;
  2. Raise a timeout and exit when the environment is unavailable.

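
A minimal sketch of what the second suggestion could look like, assuming the blocking environment call can be wrapped in a plain Python callable. `EnvironmentUnavailable`, `call_with_timeout`, and `run_or_exit` are hypothetical names used for illustration; none of them are part of AgentBench.

```python
import sys
import threading


class EnvironmentUnavailable(RuntimeError):
    """Hypothetical error: the environment did not answer in time."""


def call_with_timeout(fn, timeout_s, *args, **kwargs):
    """Run fn(*args, **kwargs) in a daemon thread; raise if it exceeds timeout_s."""
    outcome = {}

    def target():
        try:
            outcome["value"] = fn(*args, **kwargs)
        except BaseException as exc:  # hand any failure back to the caller
            outcome["error"] = exc

    # A daemon thread does not keep the interpreter alive, so a stuck call
    # is abandoned (not killed) and ends when the process exits.
    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout_s)

    if worker.is_alive():
        raise EnvironmentUnavailable(f"no response within {timeout_s} s")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


def run_or_exit(fn, timeout_s):
    """Call fn with a timeout; print an error and exit with status 1 on timeout."""
    try:
        return call_with_timeout(fn, timeout_s)
    except EnvironmentUnavailable as exc:
        print(f"error: environment unavailable: {exc}", file=sys.stderr)
        sys.exit(1)
```

With a wrapper like this, a run that cannot reach its environment would stop with an error message instead of waiting forever. The timeout value would need tuning to how long a legitimate step can take.
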
harshraj172 commented 1 year ago

I had the same error. I tried workers=16, and that worked.

cyente commented 1 year ago

I had the same error, but workers=16 does not work for me. It seems to hang on myAI.run(). Could someone give a detailed example of how to run CardGame?
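
One way to confirm where a run hangs is the standard-library `faulthandler` module, which can periodically dump every thread's stack while a call is in progress. The sketch below is an assumption about how one might wrap the suspect call; `run_with_stack_dumps` is a hypothetical helper, not something AgentBench provides.

```python
import faulthandler
import sys


def run_with_stack_dumps(fn, interval_s=30.0, file=None):
    """Call fn(); while it runs, dump all thread stacks every interval_s seconds."""
    if file is None:
        file = sys.stderr  # must be a real file with a file descriptor
    faulthandler.dump_traceback_later(interval_s, repeat=True, file=file)
    try:
        return fn()
    finally:
        # Stop the periodic dumps once the call returns or raises.
        faulthandler.cancel_dump_traceback_later()
```

If the same frame keeps appearing at the top of each dump, that is the call that is blocking.
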

sjqgogogo commented 1 year ago

I had the same error and solved it by running in Docker. You can use the commands in ./scripts/eval_utils.sh, or create your own assignment with create_assignment.py (see https://github.com/THUDM/AgentBench/blob/main/docs/tutorial.md), to run the evaluation in Docker.