Yifan-Song793 / ETO

Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)
https://arxiv.org/abs/2403.02502
88 stars, 9 forks

How were the expert trajectories collected? #2

Closed: Fu-Dayuan closed this issue 5 months ago

Fu-Dayuan commented 6 months ago

As the title says: were the expert trajectories sampled from ChatGPT (or GPT-4), or from llama-chat? I noticed that even the SFT version scores much higher than the llama-chat version.

Yifan-Song793 commented 6 months ago

Hello, and thank you for your interest in our work!

yananchen1989 commented 5 months ago

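Hello, could you share the expert trajectories used for training in this project?

I mean all of them, not only the annotations provided by the WebShop authors (https://drive.google.com/file/d/1GWC8UlUzfT9PRTRxgYOwuKSJp4hyV1dp/view).

Thanks.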

Yifan-Song793 commented 5 months ago

Hello, setup.sh automatically downloads the expert trajectories for all three environments: WebShop, ScienceWorld, and ALFWorld. You can also download them here: https://drive.google.com/file/d/1YbhbL8RhQGDWFv5y6k1qgwRqSyFFsao8/view?usp=drive_link
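
For anyone who wants to fetch that archive from a script rather than the browser, here is a minimal sketch of turning the share link into a direct-download URL. It is not taken from setup.sh. The helper names `drive_file_id` and `direct_download_url` are hypothetical, and the `uc?export=download` endpoint is an assumption based on the commonly used Google Drive pattern; for large files Drive may still show a confirmation page.

```python
import re

# Share link quoted in the reply above.
SHARE_URL = (
    "https://drive.google.com/file/d/"
    "1YbhbL8RhQGDWFv5y6k1qgwRqSyFFsao8/view?usp=drive_link"
)


def drive_file_id(share_url: str) -> str:
    """Extract the file ID from a Google Drive '/file/d/<id>/...' link."""
    match = re.search(r"/file/d/([A-Za-z0-9_-]+)", share_url)
    if match is None:
        raise ValueError(f"not a Google Drive file link: {share_url}")
    return match.group(1)


def direct_download_url(share_url: str) -> str:
    """Build a direct-download URL (assumed 'uc?export=download' pattern)."""
    return (
        "https://drive.google.com/uc?export=download&id="
        + drive_file_id(share_url)
    )


if __name__ == "__main__":
    # Print the URL; pass it to a downloader of your choice.
    print(direct_download_url(SHARE_URL))
```

The sketch only builds the URL and makes no network request, so the download tool and the archive's internal layout are left open.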