web-arena-x / webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
https://webarena.dev
Apache License 2.0

Bug in the interaction between the environment and the action? #16

Closed njucckevin closed 11 months ago

njucckevin commented 1 year ago

I tried minimal_example.py with some other actions, but the interaction seems strange. For example, in task 156.json, my first action is click [xxx] on [xxx] link 'Starred 7', and this step works in the Chrome UI. My second action is click [yyyy] on [yyyy] link 'Explore', but this step does not seem to take effect in the UI. When I then issue click [yyyy] again, it works and brings me to the 'Explore' page. I noticed that when the actree_obs starts with Tab {idx}, there may be a problem with the next action. However, different actions seem to behave differently: some work and others do not. This interaction confuses me a lot.

njucckevin commented 1 year ago

It seems that in some cases obs["text"] shows the whole page (including the parts reachable only by scrolling up or down) instead of only the part currently in view. For example, click the link 'The A11Y Project / a11yproject.coma11yproject.contributor.me' in 156.json.
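To make the expected behavior concrete, here is a minimal sketch of what "only the part currently in view" could mean for an observation. It is not the repository's implementation: `Node`, `visible_nodes`, `scroll_y`, and `viewport_height` are hypothetical names, and the sketch assumes each node carries a vertical position in page coordinates.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for one accessibility-tree node."""
    text: str
    top: float     # y of the node's upper edge, in page coordinates
    height: float  # vertical size of the node


def visible_nodes(nodes: List[Node], scroll_y: float, viewport_height: float) -> List[Node]:
    """Keep only nodes whose vertical extent overlaps the current viewport."""
    lower = scroll_y
    upper = scroll_y + viewport_height
    return [n for n in nodes if n.top < upper and n.top + n.height > lower]


# Illustrative data only; positions are made up.
page = [Node("header", 0, 50), Node("mid", 700, 40), Node("footer", 2000, 50)]
in_view = [n.text for n in visible_nodes(page, scroll_y=0, viewport_height=720)]
```

With the made-up positions above, an unscrolled 720-pixel viewport keeps the first two nodes and drops the footer, which is the behavior I would expect from obs["text"].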

shuyanzhou commented 1 year ago

Thanks for reporting this @njucckevin.

> after that, I action again by click [yyyy] and this time it work well and bring me to the 'Explore' page.

Can you try increasing the sleep time to see whether this is a loading problem? I tried to replicate the issue on my end and did not encounter the same problem. Sometimes the page is not fully loaded when the code processes the accessibility tree observation, so the observation may not match what is displayed.
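As an alternative to a fixed sleep, one option is to poll the observation until it stops changing. The sketch below is only an illustration of that idea, not code from the repository: `wait_until_stable` and `read_obs` are hypothetical names, and `read_obs` stands for whatever callable returns the current accessibility tree text.

```python
import time
from typing import Callable


def wait_until_stable(
    read_obs: Callable[[], str],
    interval: float = 0.5,
    timeout: float = 10.0,
) -> str:
    """Poll read_obs until two consecutive reads match, or until timeout.

    Returns the last observation read, whether or not it stabilized.
    """
    deadline = time.monotonic() + timeout
    previous = read_obs()
    while time.monotonic() < deadline:
        time.sleep(interval)  # give the page time to keep loading
        current = read_obs()
        if current == previous:
            return current    # unchanged between two reads: treat as loaded
        previous = current
    return previous
```

Two identical reads do not prove the page is fully loaded (a slow network can stall between reads), so a longer interval is safer on an unreliable connection.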

> show the whole page

I am looking into this.

njucckevin commented 1 year ago

The first problem magically disappeared when I got home, everything was fine even at sleep=1.5. I will check it tomorrow in my office. (⊙o⊙)

njucckevin commented 1 year ago

The first problem seems to be related to the network environment.

  1. Using my own Wi-Fi: the problem exists.
  2. Using my own Wi-Fi + VPN: it works well.
  3. Using my office Wi-Fi (with or without VPN): the problem exists. Note that my office Wi-Fi seems to have its own VPN.

shuyanzhou commented 1 year ago

Thanks for the analysis!

Also, a quick update: the second issue is a bug, and I am fixing it.