hkust-nlp / AgentBoard

An Analytical Evaluation Board of Multi-turn LLM Agents

KeyError: No "additional_info" Field. #21

Closed zhanwenchen closed 3 weeks ago

zhanwenchen commented 3 weeks ago

I got this error after git-cloning the agentboard dataset. The only keys available on each item are dict_keys(['id', 'goal', 'difficulty', 'subgoals']).

(ab) zhanwen@zhanwen-mini:~/ab$ python agentboard/eval_main.py \
    --cfg-path eval_configs/main_results_all_tasks.yaml \
    --tasks alfworld \
    --model gpt-3.5-turbo-0613 \
    --log_path ./results/gpt-3.5-turbo-0613 \
    --project_name evaluate-gpt-35-turbo-0613 \
    --baseline_dir ./data/baseline_results
[2024-09-21 20:18:09,265] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
2024-09-21 20:18:10 | INFO | __main__ | Start loading language model
2024-09-21 20:18:10 | INFO | __main__ | Finished loading language model
2024-09-21 20:18:10 | INFO | __main__ | Wandb is not enabled
2024-09-21 20:18:10 | INFO | __main__ | Tested tasks: 
2024-09-21 20:18:10 | INFO | __main__ | Start evaluating task alfworld
Initializing AlfredTWEnv...
0it [00:00, ?it/s]
Overall we have 0 games in split=eval_out_of_distribution
Evaluating with 0 games
> /home/zhanwen/ab/agentboard/environment/alfworld/alfworld_env.py(34)__init__()
-> self.labeled_data[item["additional_info"]['description']] = item
(Pdb) item["additional_info"]
*** KeyError: 'additional_info'
(Pdb) item.keys()
dict_keys(['id', 'goal', 'difficulty', 'subgoals'])
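
To see how widespread the missing key is without stepping through the debugger, a check along these lines could list which records lack it. This is only a minimal sketch: the helper names `summarize_keys` and `missing_key_ids` and the sample records are hypothetical and not part of AgentBoard; only the key names are taken from the output above.

```python
from collections import Counter


def summarize_keys(records):
    """Count how often each top-level key appears across the records."""
    counts = Counter()
    for record in records:
        counts.update(record.keys())
    return counts


def missing_key_ids(records, key="additional_info"):
    """Return the 'id' of every record that does not contain `key`."""
    return [record.get("id") for record in records if key not in record]


# Illustrative records only; real items would come from the loaded dataset.
sample = [
    {"id": 0, "goal": "placeholder", "difficulty": "easy", "subgoals": []},
    {
        "id": 1,
        "goal": "placeholder",
        "difficulty": "hard",
        "subgoals": [],
        "additional_info": {"description": "demo"},
    },
]

print(summarize_keys(sample))
print(missing_key_ids(sample))
```

If every record shows up in the second list, the data on disk is probably not the version the code expects, rather than a single corrupted entry.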

zhanwenchen commented 3 weeks ago

Apparently the files you get by git-cloning the Hugging Face dataset are not the data this codebase expects. Instead, I downloaded and extracted the data.tar.gz archive:

mv data data-old
mkdir data
wget https://huggingface.co/datasets/hkust-nlp/agentboard/resolve/main/data.tar.gz
tar -zxvf data.tar.gz

python agentboard/eval_main.py ...

And it now works!
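
For anyone who prefers to script the download, the wget and tar steps above could be sketched in Python roughly as follows. This is an untested-against-the-real-archive sketch using only the standard library: the function names `extract_archive` and `fetch_and_extract` are hypothetical, and it does not rename the old data directory for you. Only extract archives you trust.

```python
import tarfile
import urllib.request

# URL taken from the wget command above.
DATA_URL = (
    "https://huggingface.co/datasets/hkust-nlp/agentboard/"
    "resolve/main/data.tar.gz"
)


def extract_archive(archive_path, dest_dir="."):
    """Extract a gzip-compressed tarball into dest_dir and return member names."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)
        return tar.getnames()


def fetch_and_extract(url=DATA_URL, archive_path="data.tar.gz", dest_dir="."):
    """Download the archive (like wget), then unpack it (like tar -zxf)."""
    urllib.request.urlretrieve(url, archive_path)
    return extract_archive(archive_path, dest_dir)
```

Calling `fetch_and_extract()` from the repository root, after moving the old data directory aside, would then mirror the commands above.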