web-arena-x / webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
https://webarena.dev
Apache License 2.0

Long cost time when getting observation during env.step/reset #109

Closed zijianma17 closed 1 month ago

zijianma17 commented 4 months ago

Hi, I have been testing on your wonderful datasets for months, and I now plan to run more experiments on them. I found that _get_obs() takes a lot of time, whether it is called from the "reset" function or the "step" function. Could you give me some advice on how to speed this up? Would better hardware help? The speed was acceptable for some small tests before, but now I hope it can run faster. Thanks for your reply!

frankxu2004 commented 3 months ago

One thing to try is to disable headless browser mode and see what the browser is actually doing at that point. I also think hardware is quite important, especially if you are running on servers: a server without any GPU prevents the Chromium rendering engine from using hardware acceleration. Your best bet is still to check the browser visually (headless=false).

wookayin commented 3 months ago

Probably dupe of #66

zijianma17 commented 3 months ago

Thanks. I did use headless=false to have a look at the browser. The issue is the same one @wookayin mentioned. It seems there is still no good way to solve it (e.g. using async instead, but that would need a lot of work).

zijianma17 commented 3 months ago

  1. env.step takes more than 10s almost every time on my computer. Any idea why it takes only about 2~3 seconds for you? Maybe it is due to the hardware?
  2. The most time-consuming function is get_bounding_client_rect(), just like the problem in #66, which is unacceptable for a large number of experiments. However, I would like to try "current_viewport_only=False", in which case the bounds of each node are, in my opinion, no longer needed. Am I right? Is this the only use of the bounds?
  3. However, after I assign node_bound to [0, 0, 10, 10] here and skip get_bounding_client_rect(), the env does run very fast, but element_id no longer works, so the relevant dict may be constructed incorrectly. Could you give me some advice on that? Thanks and have a nice week!
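
One generic way to confirm which call dominates a step, as point 2 describes, is Python's built-in cProfile. The following is only a minimal sketch and does not use WebArena: `fake_step`, `slow_bounds_lookup`, `fast_bookkeeping` and `profile_call` are hypothetical names, with `fake_step` standing in for a real `env.step(...)` call and `slow_bounds_lookup` standing in for a slow per-node lookup.

```python
import cProfile
import io
import pstats
import time


def slow_bounds_lookup():
    # Hypothetical stand-in for a slow per-node call.
    time.sleep(0.02)


def fast_bookkeeping():
    # Hypothetical stand-in for cheap work done in the same step.
    return sum(range(100))


def fake_step():
    # Hypothetical stand-in for one environment step.
    for _ in range(3):
        slow_bounds_lookup()
    fast_bookkeeping()


def profile_call(fn, top=5):
    """Profile fn() and return the `top` entries sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    fn()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_call(fake_step)
print(report)
```

In a real run, `fake_step` would be replaced by a small function that performs one step of the environment, and the report shows which functions account for most of the cumulative time.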

shuyanzhou commented 3 months ago

The env.step needs more than 10s

Are you experimenting with our hosted websites? To test whether this is an issue with your own device, you can modify the code here to visit arbitrary websites, both complicated ones (like Amazon) and simple ones (like the Google search home page), and time each of them.

s = f"""page.goto("<change to different URLs>")
page.scroll(down)"""
action_seq = s.split("\n")
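
The timing comparison suggested above could be sketched with a small harness like the one below. This is a minimal sketch under stated assumptions, not WebArena code: `time_step` and `fake_step` are hypothetical names, the example.com URLs are placeholders, and `fake_step` only sleeps to imitate a simple and a complicated page. In practice it would be replaced by a function that runs the action sequence for one URL.

```python
import time
from statistics import mean


def time_step(step_fn, targets, repeats=3):
    """Return the mean wall-clock seconds step_fn takes for each target."""
    timings = {}
    for target in targets:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            step_fn(target)
            samples.append(time.perf_counter() - start)
        timings[target] = mean(samples)
    return timings


def fake_step(url):
    # Hypothetical stand-in for running the action sequence on one URL.
    # It sleeps longer for the "complex" placeholder page.
    time.sleep(0.05 if "complex" in url else 0.01)


urls = ["https://simple.example.com", "https://complex.example.com"]
timings = time_step(fake_step, urls)

# Print the targets from fastest to slowest.
for url, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{url}: {seconds:.3f}s")
```

If both the simple and the complicated page are slow on your machine, the bottleneck is more likely local (hardware or browser setup) than the hosted websites.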

2 and 3

Because of the current implementation, actions do not work when current_viewport_only=False. This is similar to #102. I have started working on it.

shuyanzhou commented 1 month ago

https://github.com/web-arena-x/webarena/issues/66#issuecomment-2145514211