THUDM / AgentBench

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
https://llmbench.ai
Apache License 2.0
2.11k stars 145 forks

Cannot start properly; accessing a task raises an error #53

Closed Dhaizei closed 8 months ago

Dhaizei commented 10 months ago

INFO: 127.0.0.1:45654 - "GET /api/get_indices?name=dbbench-std HTTP/1.1" 200 OK
INFO: 127.0.0.1:45656 - "GET /api/get_indices?name=os-std HTTP/1.1" 400 Bad Request

This is the output after running python -m src.start_task -a (with no configuration changes):

<class 'src.server.tasks.os_interaction.task.OSInteraction'>
Traceback (most recent call last):
  File "/root/anaconda3/envs/py38/lib/python3.8/runpy.py", line 192, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/anaconda3/envs/py38/lib/python3.8/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/work/AgentBenchV0.2/src/server/task_worker.py", line 256, in <module>
    asyncio_task = InstanceFactory.parse_obj(conf[args.name]).create()
  File "/root/work/AgentBenchV0.2/src/typings/general.py", line 37, in create
    return getattr(mod, self.module.split(".")[-1])(**self.parameters)
  File "/root/work/AgentBenchV0.2/src/server/tasks/os_interaction/task.py", line 275, in __init__

After running python -m src.assigner, accessing os-std raises an error:

<class 'src.client.task.TaskClient'>
TaskClient created: os-std (http://localhost:5000/api)
Traceback (most recent call last):
  File "/root/anaconda3/envs/py38/lib/python3.8/runpy.py", line 192, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/anaconda3/envs/py38/lib/python3.8/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/work/AgentBenchV0.2/src/assigner.py", line 402, in <module>
    Assigner(value, args.retry).start()
  File "/root/work/AgentBenchV0.2/src/assigner.py", line 74, in __init__
    self.task_indices[task] = self.tasks[task].get_indices()
  File "/root/work/AgentBenchV0.2/src/client/task.py", line 31, in get_indices
    raise AgentBenchException(result.text, result.status_code, self.name)
src.typings.exception.AgentBenchException: ('{"detail":"Error: Task does not exist"}', 400, 'os-std')
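The 400 "Task does not exist" response can be reproduced without the assigner by querying the same endpoint directly. The following is a minimal sketch, assuming the controller listens on http://localhost:5000/api and that a registered task answers 200 while an unregistered one answers 400, as the logs in this issue show. `task_is_registered` is a hypothetical helper name, not part of AgentBench.

```python
import urllib.error
import urllib.parse
import urllib.request


def task_is_registered(name, base_url="http://localhost:5000/api",
                       opener=urllib.request.urlopen):
    """Return True if GET /get_indices?name=<name> answers 200, False on 400.

    base_url is taken from the log output in this issue; adjust it if the
    controller runs elsewhere. `opener` is injectable so the function can be
    exercised without a running server.
    """
    query = urllib.parse.urlencode({"name": name})
    url = f"{base_url}/get_indices?{query}"
    try:
        with opener(url, timeout=5) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 400:
            # Matches the '{"detail":"Error: Task does not exist"}' case.
            return False
        raise  # Any other status is unexpected; surface it to the caller.
```

Calling `task_is_registered("os-std")` before launching the assigner would show whether a worker for that task has actually been started.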

Dhaizei commented 9 months ago

Yes, nothing is broken. You also need to edit configs/start_task.yaml so that the corresponding task is started:

definition:
  import: tasks/task_assembly.yaml

start:
  dbbench-std: 1
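Since the assigner fails on any task that has no entry under `start`, it can help to compare the two lists before launching anything. The following is a minimal sketch, assuming the YAML file has already been loaded into a dict with a top-level `start` mapping of task name to worker count (the shape shown in this thread). `missing_tasks` is a hypothetical helper, not an AgentBench function.

```python
def missing_tasks(start_config, wanted):
    """List the tasks in `wanted` that the start config would not launch.

    start_config: dict mirroring configs/start_task.yaml, e.g.
                  {"start": {"dbbench-std": 1}}.
    wanted:       task names the assigner is expected to query.
    Assumption: a task counts as started when its worker count is > 0.
    """
    start = start_config.get("start") or {}
    started = {name for name, workers in start.items() if workers > 0}
    return sorted(set(wanted) - started)


config = {
    "definition": {"import": "tasks/task_assembly.yaml"},
    "start": {"dbbench-std": 1},
}
print(missing_tasks(config, ["dbbench-std", "os-std"]))  # ['os-std']
```

With the config from this thread, `os-std` is reported as missing, which is consistent with the 400 response for that task.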

qinhy14 commented 9 months ago

Yes, nothing is broken. You also need to edit configs/start_task.yaml so that the corresponding task is started:

definition:
  import: tasks/task_assembly.yaml

start:
  dbbench-std: 1

Yes, that is how I set it up as well:

definition:
  import: tasks/task_assembly.yaml

start:
  alfworld-dev: 1

zhc7 commented 9 months ago

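I found that inference with a 70B model is very slow. My idea is to split the dataset under test into 4 parts and launch 4 model workers to run inference, which would improve inference throughput. I hope this feature can be added.

Hi @Dhaizei, the approach we currently recommend is to set up a forwarding server in front of the model workers (such as the controller in FastChat) and then set the agent's concurrency to 4.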
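As a rough illustration of why this has the same effect as splitting the dataset by hand, the sketch below sends prompts through a single forwarding entry point with 4 requests in flight, and the forwarder spreads them over the workers. Everything here is hypothetical: `RoundRobinForwarder` and `run_agent` are stand-ins written for this example, not FastChat or AgentBench APIs, and a real controller may use a different dispatch policy.

```python
import itertools
import threading
from concurrent.futures import ThreadPoolExecutor


class RoundRobinForwarder:
    """Hypothetical stand-in for a forwarding server in front of model workers."""

    def __init__(self, workers):
        self._workers = itertools.cycle(workers)
        self._lock = threading.Lock()  # cycle() is not safe to advance concurrently

    def __call__(self, prompt):
        with self._lock:
            worker = next(self._workers)
        return worker(prompt)  # the actual inference runs outside the lock


def run_agent(prompts, forward, concurrency=4):
    """Keep up to `concurrency` requests in flight; results keep input order."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(forward, prompts))
```

With 4 workers behind the forwarder and `concurrency=4`, each worker stays busy without the dataset having to be partitioned in advance.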

YinSonglin1997 commented 2 months ago

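Yes, nothing is broken. You also need to edit configs/start_task.yaml so that the corresponding task is started:

definition:
  import: tasks/task_assembly.yaml

start:
  dbbench-std: 1

Yes, that is how I set it up as well:

definition:
  import: tasks/task_assembly.yaml

start:
  alfworld-dev: 1

Did you manage to solve this? I am running into the same problem.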