web-arena-x / webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
https://webarena.dev
Apache License 2.0

Success/Fail Annotations for Execution Traces #59

Closed by Jiayi-Pan 8 months ago

Jiayi-Pan commented 8 months ago

Thank you for the amazing work! I am currently working to inspect the execution traces here.

However, these traces do not come with success/fail annotations, which makes it hard to tell how each trajectory turned out. Do you happen to have this accompanying data? If so, could you please share it?

Best :-)

shuyanzhou commented 8 months ago

The results are recorded in merge_log.txt. An entry looks like this:

2023-09-28 01:01:55,517 - INFO - [Config file]: /var/folders/tj/3vs6n8b53wj6q691gkb1fd3c0000gn/T/tmp1m5j_5pp/2.json
2023-09-28 01:01:55,517 - INFO - [Intent]: What is the top-1 best-selling product type in Quarter 1 2022
2023-09-28 01:02:57,667 - INFO - [Result] (FAIL) /var/folders/tj/3vs6n8b53wj6q691gkb1fd3c0000gn/T/tmp1m5j_5pp/2.json

where [Result] (FAIL) indicates the execution failed and [Result] (PASS) indicates the execution succeeded.
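For anyone who wants the annotations in a machine-readable form, here is a minimal Python sketch that pulls the PASS/FAIL status out of log text in the format shown above. The function names `parse_results` and `task_id` are hypothetical, not part of the WebArena code, and treating the config file name (e.g. `2` from `2.json`) as the task identifier is an assumption based only on the sample entry.

```python
import re
from pathlib import Path
from typing import Dict

# Matches lines such as:
#   2023-09-28 01:02:57,667 - INFO - [Result] (FAIL) /path/to/2.json
RESULT_RE = re.compile(r"\[Result\] \((PASS|FAIL)\) (.+?)\s*$")

# The sample entry quoted in this thread.
SAMPLE_LOG = """\
2023-09-28 01:01:55,517 - INFO - [Config file]: /var/folders/tj/3vs6n8b53wj6q691gkb1fd3c0000gn/T/tmp1m5j_5pp/2.json
2023-09-28 01:01:55,517 - INFO - [Intent]: What is the top-1 best-selling product type in Quarter 1 2022
2023-09-28 01:02:57,667 - INFO - [Result] (FAIL) /var/folders/tj/3vs6n8b53wj6q691gkb1fd3c0000gn/T/tmp1m5j_5pp/2.json
"""


def parse_results(log_text: str) -> Dict[str, bool]:
    """Map each config file path to True (PASS) or False (FAIL)."""
    results: Dict[str, bool] = {}
    for line in log_text.splitlines():
        match = RESULT_RE.search(line)
        if match:
            status, config_path = match.groups()
            results[config_path] = status == "PASS"
    return results


def task_id(config_path: str) -> str:
    """Assumption: the config file name without its extension identifies the task."""
    return Path(config_path).stem


if __name__ == "__main__":
    for path, passed in parse_results(SAMPLE_LOG).items():
        print(task_id(path), "PASS" if passed else "FAIL")
```

To run it on the real file, read merge_log.txt into a string and pass it to `parse_results`. If a config path appears more than once in the log, this sketch keeps the last result.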

We have also uploaded the newest trajectories; check out #61.

Jiayi-Pan commented 8 months ago

Awesome, thanks!