Closed wookayin closed 1 month ago
Hi @wookayin, I wasn't able to reproduce the slowness from my end. I printed the time spent before and after `env.step` in this script and the output is:
Time taken: 4.67 seconds
Time taken: 3.58 seconds
Time taken: 3.39 seconds
Could it be a problem with the local machine? Feel free to follow up with more information.
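For anyone who wants to take the same kind of measurement, here is a minimal sketch of timing a single call. The `timed` helper is a hypothetical name, not part of WebArena; it simply wraps any callable (for example `env.step`) with `time.perf_counter`.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn(*args, **kwargs) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical usage with an environment object:
#   outcome, elapsed = timed(env.step, action)
#   print(f"Time taken: {elapsed:.2f} seconds")
```

`time.perf_counter` is used rather than `time.time` because it is monotonic and intended for measuring short intervals.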
Yes, that script also runs very slowly for me. I don't think 4 seconds per environment step is fast enough.
Can you please try the following?
$ pip install py-spy
$ py-spy record python scripts/collect_obs.py
You will be able to see some profiling results; if you can share them with me, it'd be helpful. My guess is that, given your timing information, it's taking a lot of time there too. I am attaching one for your reference; `inspect.stack()` takes most of the time in my environment.
My environment uses `playwright==1.32.1` and `python==3.11.5`.
https://github.com/web-arena-x/webarena/assets/1009873/c24c297a-3fd8-4117-97ff-2dbdb9337119
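To illustrate why `inspect.stack()` can show up so prominently in a profile, here is a small standalone sketch that has nothing to do with playwright itself. It compares `inspect.stack()`, which builds a full record for every frame on the call stack, with a plain walk over the `f_back` pointers. The helper names and the stack depth of 50 are arbitrary assumptions chosen for illustration, and the timings will vary by machine.

```python
import inspect
import sys
import timeit


def depth_via_inspect() -> int:
    # Builds a FrameInfo (file name, line number, source context) per frame.
    return len(inspect.stack())


def depth_via_getframe() -> int:
    # Only follows f_back pointers; no source files are consulted.
    # sys._getframe is a CPython implementation detail.
    frame = sys._getframe(0)
    depth = 0
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth


def at_depth(n, fn):
    """Call fn() from n nested Python frames to mimic a deep call stack."""
    return fn() if n == 0 else at_depth(n - 1, fn)


if __name__ == "__main__":
    for fn in (depth_via_inspect, depth_via_getframe):
        seconds = timeit.timeit(lambda: at_depth(50, fn), number=100)
        print(f"{fn.__name__}: {seconds:.4f} s for 100 calls")
```

Both functions see the same number of frames; only the amount of work done per frame differs.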
Super sorry for the late reply. Here is the output from my end with `playwright==1.32.1` and `python==3.10.12`:
https://github.com/web-arena-x/webarena/assets/29911200/658863ce-bedc-4200-8ee9-84265dbd3c0d
So quite a lot of time is spent on your machine too, extracting the stack frames redundantly, for the exact same reason.
Although I am not experiencing significant slowness in terms of wall-clock time (each `fetch_accessibility_tree` call took 2 to 3 seconds), I will look into the problem.
Basically, the current implementation for getting the bounding box of each element might compromise the efficiency of observation rendering, although it is so far the most accurate way to get the bounding box that I can think of. If you have any ideas, feel free to follow up.
cc. @frankxu2004 Thoughts?
Depending on the performance of the machine, it can be significantly slow. Even in your environment, 2~3 seconds per step doesn't sound like a reasonable speed: website interaction itself is very fast, and the vast majority of the time is wasted on stack trace management to emulate coroutines in a synchronous fashion. It appears that the synchronous playwright APIs are meant mainly for debugging or prototyping, not for main use, because of their poor performance.
Actually, this is a problem in the underlying library, microsoft/playwright. I think in principle one can avoid it by using the asynchronous APIs instead of the synchronous ones (`AsyncScriptBrowserEnv`). WebArena will also benefit a lot from avoiding the synchronous APIs, but user applications (e.g. implementing some agents in a research project) will need to make non-trivial efforts to fully migrate to the asynchronous APIs.
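As a rough sketch of what the asynchronous style looks like (not WebArena's `AsyncScriptBrowserEnv`, whose interface is not shown in this thread), the snippet below uses playwright's `async_api`. The function names `visit_and_get_title` and `main` are hypothetical. Whether this actually removes the stack inspection overhead is the claim made above and has not been measured here.

```python
import asyncio


async def visit_and_get_title(page, url: str) -> str:
    """Navigate an async playwright Page (or a look-alike) and return its title."""
    await page.goto(url)
    return await page.title()


async def main(url: str) -> None:
    # Imported here so the helper above can be exercised without playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        print(await visit_and_get_title(page, url))
        await browser.close()


# Hypothetical usage (requires playwright and its browsers to be installed):
#   asyncio.run(main("https://example.com"))
```

The migration cost mentioned above is visible even in this tiny example: every caller up the chain has to become `async` and `await` its calls.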
Hi @wookayin ! This is not a major blocker for us so we might not be able to spend a lot of time on it now, but we'd welcome a PR that makes things faster. Your approach seems reasonable.
Yes I agree with you and understand that! Thanks for your messages and help. If I can come up with some good workaround or improvements, I will be happy to contribute back.
Throwback response -- BrowserGym did a very nice implementation of this by injecting JS instead of making calls from the client.
I attempted to incorporate it into our codebase but found the observation difference made our previous results not reproducible. We decided to keep our current implementation, but feel free to check it out if it is still interesting to you.
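For readers curious about the general idea, here is a minimal sketch of collecting bounding boxes with one injected script instead of one call per node. This is not BrowserGym's actual code nor WebArena's implementation; `COLLECT_RECTS_JS`, `collect_bounding_boxes`, and `visible_only` are hypothetical names, and the sketch does not map elements back to accessibility tree nodes, so its output would differ from the current observations.

```python
from typing import Any, Dict, List

# JavaScript evaluated once inside the page. It walks every element and
# returns plain objects, so only a single round trip to the browser is needed.
COLLECT_RECTS_JS = """
() => Array.from(document.querySelectorAll('*')).map((el, i) => {
    const r = el.getBoundingClientRect();
    return {index: i, tag: el.tagName.toLowerCase(),
            x: r.x, y: r.y, width: r.width, height: r.height};
})
"""


def collect_bounding_boxes(page: Any) -> List[Dict[str, Any]]:
    """Return one dict per element using a single page.evaluate call.

    `page` is assumed to be a playwright sync Page or any object
    exposing a compatible evaluate(expression) method.
    """
    return page.evaluate(COLLECT_RECTS_JS)


def visible_only(boxes: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only elements whose bounding box has a non-zero area."""
    return [b for b in boxes if b["width"] > 0 and b["height"] > 0]
```

With this shape, the cost of crossing the Python/browser boundary is paid once per observation rather than once per DOM node.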
I find the environment frustratingly slow: it takes around 10 seconds for a single step transition or just for calling `env.reset()` once.
Profiling tells us that most of the time is spent in `fetch_page_accessibility_tree`, more precisely in `self.get_bounding_client_rect`:
https://github.com/web-arena-x/webarena/blob/main/browser_env/processors.py#L394-L396
and the root cause is playwright:
Note that this is called for EVERY node in the DOM tree, which is very inefficient. It doesn't make much sense to me that stack trace information needs to be used in playwright's sync_base API; have you run into this before? Would there be any workaround or known solution to make `fetch_page_accessibility_tree` more efficient?
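The profiling described in this report can also be reproduced in-process with the standard library, as a complement to py-spy. The sketch below is one way to do it; `profile_call` is a hypothetical helper, not part of WebArena, and it can wrap any callable such as `env.reset`.

```python
import cProfile
import io
import pstats
from typing import Any, Callable, Tuple


def profile_call(fn: Callable[..., Any], *args: Any, top: int = 10,
                 **kwargs: Any) -> Tuple[Any, str]:
    """Run fn under cProfile and return (result, text report).

    The report lists the `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


# Hypothetical usage with an environment object:
#   obs, report = profile_call(env.reset)
#   print(report)
```

Sorting by cumulative time makes it easier to spot a function that is cheap per call but invoked once per DOM node.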