ServiceNow / BrowserGym

BrowserGym, a gym environment for web task automation in the Chromium browser.

Bug when resetting environment for certain tasks #158

Open rutgercap opened 3 weeks ago

rutgercap commented 3 weeks ago

Hi all,

I've been using BrowserGym to evaluate my agent. However, with certain tasks a colleague and I are both running into the same error. It happens regularly and we can't seem to fix it. We've tried multiple environments (our MacBooks and a remote desktop), and it happens in all of them.

Has this error occurred before? Would you perhaps know what the problem is?

I'm not sure which information is relevant to help clarify the problem so please let me know if any other information is needed.

Thanks in advance. Kind regards, R

Starting task 1/33: <class 'browsergym.workarena.tasks.list.FilterServiceCatalogItemListTask'>
Traceback (most recent call last):
  File "/Users/rutgercappendijk/Documents/twin/monorepo/apps/agent_extension/agent/src/browsergym/browsergym_evaluation.py", line 270, in <module>
    main()
  File "/Users/rutgercappendijk/Documents/twin/monorepo/apps/agent_extension/agent/src/browsergym/browsergym_evaluation.py", line 232, in main
    obs, _ = env.reset()
             ^^^^^^^^^^^
  File "/Users/rutgercappendijk/Library/Caches/pypoetry/virtualenvs/agent-0Q5EG4EM-py3.12/lib/python3.12/site-packages/browsergym/core/env.py", line 269, in reset
    goal, task_info = self.task.setup(page=self.page)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rutgercappendijk/Library/Caches/pypoetry/virtualenvs/agent-0Q5EG4EM-py3.12/lib/python3.12/site-packages/browsergym/workarena/tasks/base.py", line 153, in setup
    self.start(page)
  File "/Users/rutgercappendijk/Library/Caches/pypoetry/virtualenvs/agent-0Q5EG4EM-py3.12/lib/python3.12/site-packages/browsergym/workarena/tasks/list.py", line 557, in start
    self._wait_for_ready(page)
  File "/Users/rutgercappendijk/Library/Caches/pypoetry/virtualenvs/agent-0Q5EG4EM-py3.12/lib/python3.12/site-packages/browsergym/workarena/tasks/list.py", line 189, in _wait_for_ready
    page.wait_for_function(
  File "/Users/rutgercappendijk/Library/Caches/pypoetry/virtualenvs/agent-0Q5EG4EM-py3.12/lib/python3.12/site-packages/playwright/sync_api/_generated.py", line 11298, in wait_for_function
    self._sync(
  File "/Users/rutgercappendijk/Library/Caches/pypoetry/virtualenvs/agent-0Q5EG4EM-py3.12/lib/python3.12/site-packages/playwright/_impl/_sync_base.py", line 115, in _sync
    return task.result()
           ^^^^^^^^^^^^^
  File "/Users/rutgercappendijk/Library/Caches/pypoetry/virtualenvs/agent-0Q5EG4EM-py3.12/lib/python3.12/site-packages/playwright/_impl/_page.py", line 1032, in wait_for_function
    return await self._main_frame.wait_for_function(**locals_to_params(locals()))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rutgercappendijk/Library/Caches/pypoetry/virtualenvs/agent-0Q5EG4EM-py3.12/lib/python3.12/site-packages/playwright/_impl/_frame.py", line 772, in wait_for_function
    return from_channel(await self._channel.send("waitForFunction", params))
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rutgercappendijk/Library/Caches/pypoetry/virtualenvs/agent-0Q5EG4EM-py3.12/lib/python3.12/site-packages/playwright/_impl/_connection.py", line 59, in send
    return await self._connection.wrap_api_call(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rutgercappendijk/Library/Caches/pypoetry/virtualenvs/agent-0Q5EG4EM-py3.12/lib/python3.12/site-packages/playwright/_impl/_connection.py", line 514, in wrap_api_call
    raise rewrite_error(error, f"{parsed_st['apiName']}: {error}") from None
playwright._impl._errors.TimeoutError: Page.wait_for_function: Timeout 30000ms exceeded.

gasse commented 3 weeks ago

Hi @rutgercap, thanks for raising this issue! Does it happen systematically (with the environment seed fixed), or is it stochastic? Does the error go away if you re-run the episode?

If this happens only once in a while at random, then my guess would be network or server issues. Unfortunately, anything that runs against the live web is always a bit stochastic. A potential fix could be to request a fresh ServiceNow instance, or maybe to change your connection or use a VPN. Please let us know how it goes.

PS: did you try AgentLab? It has an easy mechanism to re-run episodes that failed due to this kind of error (environment crash).
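For setups that do not use AgentLab, the re-run idea can be approximated with a small retry wrapper around `env.reset()`. This is only a minimal sketch, not AgentLab's mechanism: `reset_with_retries`, `max_attempts`, `base_delay`, and `sleep` are hypothetical names, and the only thing assumed about `env` is that it has a `reset()` method, as in the traceback above.

```python
import time


def reset_with_retries(env, max_attempts=3, base_delay=5.0, sleep=time.sleep):
    """Call env.reset(), retrying with exponential backoff if it raises.

    Hypothetical helper: it retries on any exception, which also covers
    the Playwright TimeoutError raised during task setup.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return env.reset()
        except Exception as error:
            last_error = error
            if attempt < max_attempts:
                # Wait base_delay, then 2x, then 4x, ... before the next try.
                sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(
        f"env.reset() failed after {max_attempts} attempts"
    ) from last_error
```

In the evaluation script this would replace the bare call, e.g. `obs, _ = reset_with_retries(env)`. Retrying only hides transient failures; if every attempt fails, the original error is still attached as the cause of the final exception.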

rutgercap commented 1 week ago

Hi @gasse, thanks for your quick response, and apologies for my delayed reply. I wasn't working on this project for a while, so I didn't have time to look into your questions.

It happens stochastically but very often. I have a script that just sets up every task 3 times. Initially everything works fine (first 7 tasks), but the success rate rapidly decreases to a tiny percentage. After the first 7 it fails about 6 times, and it quickly deteriorates to the point where basically no task can be set up.

The most common error message is: HTTPSConnectionPool(host='dev199980.service-now.com', port=443): Max retries exceeded with url: /api/now/table/change_request?sysparm_query=sys_id%3D9ba99d0b838d961003d41f65eeaad355 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x17fc83fe0>: Failed to establish a new connection: [Errno 61] Connection refused')

In waves we also get a lot of: ServiceNow instance at https://dev199980.service-now.com is not reachable. Please check the URL.

When I go to the URL in the browser, it also gives me an error page with failing network requests (status codes in the inspector are FAILED). If I refresh a few times I can log back in, but I've noticed that many of the numbers in the default dashboard show errors.

Could it be that the dev instance is overloaded? Is this a common problem?

I've seen that ServiceNow offers on-prem installations. Would it be possible to get WorkArena as a Docker container so we can bypass the connection errors?

rutgercap commented 1 week ago

Also worth noting: I had a wave of this error yesterday morning: ServiceNow instance at https://dev199980.service-now.com is not reachable. Please check the URL.

Creating a new instance did not solve it, and neither did creating a new account with its own instance. Was there something wrong with the servers? Or is my IP perhaps being rate limited?

aldro61 commented 1 week ago

Hi @rutgercap!

Docker: I looked into this a while back. Unfortunately, it is not possible as this would involve shipping the whole product code base to WorkArena users. We are looking for ways to avoid using "personal developer instances", but this will take a little while.

About the errors you are experiencing:

playwright._impl._errors.TimeoutError: Page.wait_for_function: Timeout 30000ms exceeded.

This type of error can occur when the agent tries to click on controls that are not really on the page (e.g., hallucinated). However, here the error happens while the code waits for the page to load. This suggests that it is caused by latency that delays loading and causes the code to time out.

Are you running multiple evaluations in parallel? If yes, try reducing the number of parallel runs. This has usually solved it for me.
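One way to cap the number of parallel runs is a fixed-size worker pool. The sketch below is an illustration under assumptions, not part of BrowserGym or WorkArena: `run_with_limited_parallelism` and `run_episode` are hypothetical names, and `run_episode` is assumed to create and close its own environment inside the worker thread, since Playwright objects should not be shared across threads.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_limited_parallelism(run_episode, task_ids, max_parallel=2):
    """Run run_episode(task_id) for every task, at most max_parallel at once.

    run_episode is a hypothetical callable supplied by the caller; it is
    assumed to build and tear down its own environment.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map returns results in the same order as task_ids.
        return list(pool.map(run_episode, task_ids))
```

Lowering `max_parallel` until the timeouts stop is a quick way to check whether load on the instance is the cause.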

ServiceNow instance at https://dev199980.service-now.com is not reachable. Please check the URL.

This suggests a network issue. I've seen it happen in the past due to a problem at the level of ServiceNow's ISP. In that case, it resolved on its own. It might also be due to rate limiting. I reached out to the team that manages the developer instances to find out whether they enforce such limits. I will get back to you ASAP with the answer.

Meanwhile, you can try releasing this instance and creating a new one. I've observed that some are slower than others. Let me know if this helps.
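To tell a slow or unreachable instance apart from a task-level failure, a quick reachability probe before starting a batch can help. This is a minimal sketch using only the Python standard library; it is not the check that WorkArena performs internally, and `instance_is_reachable`, `timeout`, and `opener` are hypothetical names chosen for illustration.

```python
import urllib.error
import urllib.request


def instance_is_reachable(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if an HTTP request to url receives any HTTP response."""
    try:
        with opener(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status (e.g. 401 or 403),
        # so the network path to the instance itself works.
        return True
    except (urllib.error.URLError, OSError):
        # DNS failure, connection refused, timeout, and similar.
        return False
```

For example, `instance_is_reachable("https://dev199980.service-now.com")` could be called before each task, and the task skipped or delayed when it returns `False`.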

aldro61 commented 5 days ago

@rutgercap I confirm that no rate limiting is applied to the dev instances.