web-arena-x / webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
https://webarena.dev
Apache License 2.0
676 stars, 103 forks

Improve Evaluation Annotation #44

Closed by shuyanzhou 11 months ago

shuyanzhou commented 11 months ago

Update the evaluations:

  1. Fix annotation errors (#24, #30)
  2. Correct inaccurate locators
  3. Reuse the string checkers from string_match in program_html
  4. Use exact_match in more places to avoid false positives
  5. Merge must_include strings that share the same URL and locator
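
To illustrate items 4 and 5, here is a minimal sketch of why exact_match is stricter than must_include, and of how entries sharing a URL and locator might be merged. It is not the repository's actual code: the function names, the dictionary keys ("url", "locator", "must_include"), and the normalization (strip and lowercase) are assumptions made for illustration only.

```python
from __future__ import annotations


def exact_match(ref: str, pred: str) -> bool:
    """Hypothetical checker: pass only if the whole prediction equals the reference."""
    return pred.strip().lower() == ref.strip().lower()


def must_include(ref: str, pred: str) -> bool:
    """Hypothetical checker: pass if the reference occurs anywhere in the prediction."""
    return ref.strip().lower() in pred.strip().lower()


def merge_targets(targets: list[dict]) -> list[dict]:
    """Merge entries that share the same (url, locator) pair.

    Each entry is assumed to look like
    {"url": str, "locator": str, "must_include": list[str]}.
    First-seen order of the (url, locator) pairs is preserved.
    """
    merged: dict[tuple[str, str], list[str]] = {}
    for target in targets:
        key = (target["url"], target["locator"])
        merged.setdefault(key, []).extend(target["must_include"])
    return [
        {"url": url, "locator": locator, "must_include": strings}
        for (url, locator), strings in merged.items()
    ]


# A substring check accepts "100" when the reference is "10" (a false positive),
# while an exact comparison rejects it.
substring_result = must_include("10", "100")
exact_result = exact_match("10", "100")

# Two checks against the same page element collapse into a single entry.
example_targets = [
    {"url": "last", "locator": "", "must_include": ["foo"]},
    {"url": "last", "locator": "", "must_include": ["bar"]},
]
merged_targets = merge_targets(example_targets)
```

Merging in this way means each URL and locator pair is represented once in the annotation, with all of its required strings listed together.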