web-arena-x / webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
https://webarena.dev
Apache License 2.0

The fields require_login and require_reset in eval set are not properly assigned #98

Closed eyalis-google closed 3 months ago

eyalis-google commented 5 months ago

It seems that all 812 examples in test.raw.json have require_login set to true and require_reset set to false. Is this on purpose? Is there a plan to change that?
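The observation above can be checked with a short script. This is a minimal sketch, assuming test.raw.json is a JSON array of task objects that each carry the two fields; the helper names `load_tasks` and `tally_flags` are hypothetical and not part of the repository.

```python
import json
from collections import Counter


def load_tasks(path):
    """Load the task list. Assumes the file holds a JSON array of objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def tally_flags(tasks, fields=("require_login", "require_reset")):
    """Count how often each value occurs for the given fields.

    A task that lacks a field is counted under the value None.
    """
    return {field: Counter(task.get(field) for task in tasks) for field in fields}
```

Calling `tally_flags(load_tasks(path))` with the path to your local copy of test.raw.json returns one Counter per field. If the report above holds, each Counter would contain a single value.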

shuyanzhou commented 5 months ago

Hi @eyalis-google, sorry for the confusion. Please disregard the require_reset field; it is a legacy field. For reproducibility, please reset the environments after each evaluation round.

The evaluation script performs an automatic login whenever require_login is true.
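The behavior described in this reply can be pictured roughly as follows. This is an illustrative sketch only, not the repository's evaluation script: `run_round` and the `login`, `evaluate`, and `reset` callables are hypothetical placeholders supplied by the caller.

```python
def run_round(tasks, login, evaluate, reset):
    """Run one evaluation round over a list of task dicts.

    login, evaluate, and reset are caller-supplied callables (hypothetical
    stand-ins for whatever the real harness uses).
    """
    results = []
    for task in tasks:
        # Log in automatically whenever the task asks for it.
        if task.get("require_login", False):
            login(task)
        # require_reset is deliberately ignored: the field is legacy.
        results.append(evaluate(task))
    # Reset the environments once the round is finished, for reproducibility.
    reset()
    return results
```

In this sketch the reset happens once per round rather than per task, matching the advice to reset after each evaluation round.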