THUDM / AgentBench

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
https://llmbench.ai
Apache License 2.0
2.03k stars 138 forks

INTERACT_FAILED Error: Session does not exist #70

glad4enkonm closed 8 months ago

glad4enkonm commented 8 months ago

Cannot start a KG task with vicuna-7b. Getting INTERACT_FAILED Error: Session does not exist. What could be the reason?

To Reproduce

Steps to reproduce the behavior:

1. configs/start_task.yaml

    definition:
      import: tasks/task_assembly.yaml

    start:
      kg-std: 1

2. configs/assignments/default.yaml

    import: definition.yaml

    concurrency:
      task:
        kg-std: 1
      agent:
        vicuna-7b: 1

    assignments: # List[Assignment] | Assignment

    output: "outputs/{TIMESTAMP}"

3. Running Vicuna 7b on cpu 8bit, starting the 
4. python -m src.start_task -a
5. python -m src.assigner
6. outputs/2023-11-07-22-36-41/vicuna-7b/kg-std/error.jsonl

    {"index": 149, "error": "INTERACT_FAILED", "info": "{\"detail\":\"Error: Session does not exist\"}", "output": {"index": null, "status": "running", "result": null, "history": null}, "time": {"timestamp": 1699397194471, "str": "2023-11-07 22:46:34"}}
    {"index": 148, "error": "INTERACT_FAILED", "info": "{\"detail\":\"Error: Session does not exist\"}", "output": {"index": null, "status": "running", "result": null, "history": null}, "time": {"timestamp": 1699398002162, "str": "2023-11-07 23:00:02"}}
    {"index": 147, "error": "INTERACT_FAILED", "info": "{\"detail\":\"Error: Session does not exist\"}", "output": {"index": null, "status": "running", "result": null, "history": null}, "time": {"timestamp": 1699398438856, "str": "2023-11-07 23:07:18"}}


outputs/2023-11-07-22-36-41/config.yaml

assignments:
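
For anyone hitting the same thing: a quick way to check whether every failure in error.jsonl has the same cause is to tally the records. This is only a minimal sketch, assuming the file holds one JSON object per line and that "info" is itself a JSON-encoded string, as in the three records quoted above. The helper name `tally_errors` is made up for illustration and is not part of AgentBench.

```python
import json
from collections import Counter

# The three records quoted in this issue, one JSON object per line.
SAMPLE = r'''
{"index": 149, "error": "INTERACT_FAILED", "info": "{\"detail\":\"Error: Session does not exist\"}", "output": {"index": null, "status": "running", "result": null, "history": null}, "time": {"timestamp": 1699397194471, "str": "2023-11-07 22:46:34"}}
{"index": 148, "error": "INTERACT_FAILED", "info": "{\"detail\":\"Error: Session does not exist\"}", "output": {"index": null, "status": "running", "result": null, "history": null}, "time": {"timestamp": 1699398002162, "str": "2023-11-07 23:00:02"}}
{"index": 147, "error": "INTERACT_FAILED", "info": "{\"detail\":\"Error: Session does not exist\"}", "output": {"index": null, "status": "running", "result": null, "history": null}, "time": {"timestamp": 1699398438856, "str": "2023-11-07 23:07:18"}}
'''


def tally_errors(lines):
    """Count (error, detail) pairs across JSONL records (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        info = record.get("info")
        try:
            # "info" is assumed to be a JSON string such as {"detail": "..."}.
            detail = json.loads(info).get("detail")
        except (json.JSONDecodeError, AttributeError, TypeError):
            detail = info  # fall back to the raw value if it is not JSON
        counts[(record.get("error"), detail)] += 1
    return counts


counts = tally_errors(SAMPLE.splitlines())
for (error, detail), n in counts.items():
    print(f"{n} x {error}: {detail}")
```

To run it against a real output directory, pass an open file instead of `SAMPLE.splitlines()`, for example `tally_errors(open(path_to_error_jsonl))`.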


zhc7 commented 8 months ago

Hi, @glad4enkonm. A possible reason is that the interaction session was killed for exceeding the time limit. I noticed that you are running inference on CPU. If the LLM doesn't respond within four minutes, the session might be killed. You can increase the value at https://github.com/THUDM/AgentBench/blob/adc728e073c7ba2934c5fbf05ca1eaa10cc2b21c/src/server/task_controller.py#L180 to loosen this limit.
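
To illustrate the mechanism, here is a toy model of an idle-session timeout. It is not AgentBench's actual code: `SessionRegistry`, `SessionNotFound`, and `survives` are hypothetical names, and the only figure taken from this thread is the four-minute (240 s) limit. The 300 s reply delay is an assumed value for a slow CPU-bound model. The sketch shows why a late reply can surface as "Session does not exist": the session was already dropped by the time the agent answers.

```python
import time


class SessionNotFound(Exception):
    """Raised when an interaction targets a session that was dropped."""


class SessionRegistry:
    """Hypothetical registry that forgets sessions idle longer than timeout_s."""

    def __init__(self, timeout_s=240.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the demo needs no real waiting
        self.last_seen = {}         # session id -> time of last activity

    def start(self, session_id):
        self.last_seen[session_id] = self.clock()

    def _expire(self):
        now = self.clock()
        stale = [s for s, t in self.last_seen.items() if now - t > self.timeout_s]
        for session_id in stale:
            del self.last_seen[session_id]

    def interact(self, session_id):
        self._expire()
        if session_id not in self.last_seen:
            raise SessionNotFound("Error: Session does not exist")
        self.last_seen[session_id] = self.clock()


def survives(timeout_s, reply_delay_s):
    """Return True if a reply arriving after reply_delay_s still finds its session."""
    now = [0.0]                     # fake clock, advanced by hand
    registry = SessionRegistry(timeout_s, clock=lambda: now[0])
    registry.start("demo")
    now[0] += reply_delay_s
    try:
        registry.interact("demo")
        return True
    except SessionNotFound:
        return False


print(survives(timeout_s=240, reply_delay_s=300))    # reply arrives after the limit
print(survives(timeout_s=1800, reply_delay_s=300))   # same delay, looser limit
```

In this model, raising the limit above the slowest expected reply time is what makes the error go away, which matches the suggestion to change the number in task_controller.py.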

glad4enkonm commented 8 months ago

Hi, @zhc7, thanks for your reply. The error message is gone, but the run is stuck at 0/150. Is this the correct way to start the system?

source /home/myuser/anaconda3/bin/activate agent-bench
cd ~/AgentBench
python3 -m fastchat.serve.controller &
python3 -m fastchat.serve.model_worker --model-name vicuna-7b-v1.5 --device cpu --load-8bit &
python -m src.start_task -a &
sleep 60 && python -m src.assigner

glad4enkonm commented 8 months ago

Answering my own question: yes, that was the right way to start. It is better, though, to start each process in a separate console window rather than as background processes. The problem was that, for a KG task with the selected LLM, most cases finish with the status "task limit reached", and on a 24-CPU machine one test takes about 20 minutes. Running the same task on one GPU instance takes about 20-40 s per iteration.
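
A rough back-of-the-envelope estimate shows why the progress bar looked stuck. It assumes the 150 samples shown in the "0/150" counter, strictly sequential execution (concurrency 1, as in the config above), and that every sample takes about as long as the timings quoted here. The function name `total_hours` is just for illustration.

```python
def total_hours(n_samples, seconds_per_sample, concurrency=1):
    """Estimated wall-clock hours, assuming uniform per-sample time."""
    return n_samples * seconds_per_sample / concurrency / 3600


N_SAMPLES = 150  # from the "0/150" progress counter

cpu_hours = total_hours(N_SAMPLES, 20 * 60)   # about 20 min per test on CPU
gpu_hours_low = total_hours(N_SAMPLES, 20)    # about 20 s per iteration on GPU
gpu_hours_high = total_hours(N_SAMPLES, 40)   # about 40 s per iteration on GPU

print(f"CPU: ~{cpu_hours:.0f} h for the full run")
print(f"GPU: ~{gpu_hours_low * 60:.0f}-{gpu_hours_high * 60:.0f} min for the full run")
```

Under these assumptions the CPU run needs roughly 50 hours in total, and the counter only moves from 0 to 1 after about 20 minutes, while the GPU run finishes in roughly 50 to 100 minutes.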