Open dangne opened 2 weeks ago
Hi! Please share your credentials by email and I will try to understand: alexandre.drouin@servicenow.com.
Thanks for the credentials. I'm currently running the installer on your instance. It is a bit slower than usual, but it appears to be working.
I will let it run and let you know when it completes.
Your instance is fully set up now; there is no need to re-run on your side. I'm not sure what went wrong. It could have been network issues on your end or a temporary glitch at our data center.
Please re-open if you run into any issues while benchmarking.
Thank you for your help. I'm trying to run the following example script from the README file:
import random
from browsergym.core.env import BrowserEnv
from browsergym.workarena import ALL_WORKARENA_TASKS
from time import sleep

random.shuffle(ALL_WORKARENA_TASKS)
for task in ALL_WORKARENA_TASKS:
    print("Task:", task)

    # Instantiate a new environment
    env = BrowserEnv(task_entrypoint=task,
                     headless=False)
    env.reset()

    # Cheat functions use Playwright to automatically solve the task
    env.chat.add_message(role="assistant", msg="On it. Please wait...")
    cheat_messages = []
    env.task.cheat(env.page, cheat_messages)

    # Send cheat messages to chat
    for cheat_msg in cheat_messages:
        env.chat.add_message(role=cheat_msg["role"], msg=cheat_msg["message"])

    # Post solution to chat
    env.chat.add_message(role="assistant", msg="I'm done!")

    # Validate the solution
    reward, stop, message, info = env.task.validate(env.page, cheat_messages)
    if reward == 1:
        env.chat.add_message(role="user", msg="Yes, that works. Thanks!")
    else:
        env.chat.add_message(role="user", msg=f"No, that doesn't work. {info.get('message', '')}")

    sleep(3)
    env.close()
But I got an error message similar to the one above (with a different traceback):
Task: <class 'browsergym.workarena.tasks.compositional.DashboardRetrieveIncidentAndMeanRequestGoogleNexus7TaskL2'>
Traceback (most recent call last):
File "/Users/dang/projects/hero/try_workarena.py", line 15, in <module>
env.reset()
File "/Users/dang/.local/lib/python3.11/site-packages/browsergym/core/env.py", line 281, in reset
self.chat = Chat(
^^^^^
File "/Users/dang/.local/lib/python3.11/site-packages/browsergym/core/chat.py", line 35, in __init__
self.page = self.context.new_page()
^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dang/.local/lib/python3.11/site-packages/playwright/sync_api/_generated.py", line 13095, in new_page
return mapping.from_impl(self._sync(self._impl_obj.new_page()))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dang/.local/lib/python3.11/site-packages/playwright/_impl/_sync_base.py", line 109, in _sync
return task.result()
^^^^^^^^^^^^^
File "/Users/dang/.local/lib/python3.11/site-packages/playwright/_impl/_browser_context.py", line 281, in new_page
return from_channel(await self._channel.send("newPage"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dang/.local/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 61, in send
return await self._connection.wrap_api_call(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dang/.local/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 490, in wrap_api_call
return await cb()
^^^^^^^^^^
File "/Users/dang/.local/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 99, in inner_send
result = next(iter(done)).result()
^^^^^^^^^^^^^^^^^^^^^^^^^
playwright._impl._api_types.Error: Page closed
This is very strange. It seems to be an issue with your playwright installation. Could you tell me more about your setup?
Is this a remote server? Did you run playwright install? Anything else you think might help.
@gasse does this issue ring a bell?
@dangne I can't reproduce your issue on my end. However, I'm running into other issues later in the demo. I will address these ASAP. cc @jardinetsouffleton
Okay, I fixed the problem by updating playwright to the latest version with pip install -U playwright and running playwright install again.
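When upgrading or downgrading, it can help to confirm which Playwright version the active interpreter actually sees, since a user-level install and a conda or virtual environment can each hold a different copy. A minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of BrowserGym or WorkArena:

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Show which interpreter is running and which playwright it would import.
    print("python executable:", sys.executable)
    print("playwright version:", installed_version("playwright"))
```

Running this with the same interpreter that runs the demo script shows whether the pip command changed the environment that is really in use.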
Can you give this demo script a shot please?
import random
from browsergym.core.env import BrowserEnv
from browsergym.workarena import ALL_WORKARENA_TASKS
from browsergym.workarena.tasks.compositional.base import CompositionalTask
from time import sleep

random.shuffle(ALL_WORKARENA_TASKS)
for task in ALL_WORKARENA_TASKS:
    print("Task:", task)

    # Instantiate a new environment
    env = BrowserEnv(task_entrypoint=task,
                     headless=False)
    env.reset()

    # Cheat functions use Playwright to automatically solve the task
    env.chat.add_message(role="assistant", msg="On it. Please wait...")
    cheat_messages = []
    if isinstance(env.task, CompositionalTask):
        # Need to cheat for all subtasks
        for i in range(len(env.task.subtasks)):
            env.task.cheat(env.page, cheat_messages, i)
    else:
        env.task.cheat(env.page, cheat_messages)

    # Send cheat messages to chat
    for cheat_msg in cheat_messages:
        env.chat.add_message(role=cheat_msg["role"], msg=cheat_msg["message"])

    # Post solution to chat
    env.chat.add_message(role="assistant", msg="I'm done!")

    # Validate the solution
    reward, stop, message, info = env.task.validate(env.page, cheat_messages)
    if reward == 1:
        env.chat.add_message(role="user", msg="Yes, that works. Thanks!")
    else:
        env.chat.add_message(role="user", msg=f"No, that doesn't work. {info.get('message', '')}")

    sleep(3)
    env.close()
Yes, it (kind of) works. The browser and chat windows appear normally, but now I get timeout errors at the env.task.cheat(env.page, cheat_messages, i) step for every task.
Task: <class 'browsergym.workarena.tasks.compositional.TwoChangesFixBasicVariedRiskChangeRequestSchedulingTaskL3'>
Traceback (most recent call last):
File "/Users/dang/projects/hero/try_workarena.py", line 76, in <module>
env.task.cheat(env.page, cheat_messages, i)
File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/browsergym/workarena/tasks/compositional/base.py", line 198, in cheat
self.subtasks[subtask_idx].cheat(page, chat_messages)
File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/browsergym/workarena/tasks/list.py", line 667, in cheat
self._wait_for_ready(page)
File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/browsergym/workarena/tasks/list.py", line 189, in _wait_for_ready
page.wait_for_function(
File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/playwright/sync_api/_generated.py", line 11537, in wait_for_function
self._sync(
File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/playwright/_impl/_sync_base.py", line 115, in _sync
return task.result()
^^^^^^^^^^^^^
File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/playwright/_impl/_page.py", line 1083, in wait_for_function
return await self._main_frame.wait_for_function(**locals_to_params(locals()))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/playwright/_impl/_frame.py", line 771, in wait_for_function
return from_channel(await self._channel.send("waitForFunction", params))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 59, in send
return await self._connection.wrap_api_call(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 520, in wrap_api_call
raise rewrite_error(error, f"{parsed_st['apiName']}: {error}") from None
playwright._impl._errors.TimeoutError: Page.wait_for_function: Timeout 30000ms exceeded.
I was able to reproduce. Not sure what is causing this. Will look into it asap.
@jardinetsouffleton this issue seems to be limited to list pages. Inspecting the JS console, it looks like there is an issue with the JavaScript that checks if the page load is complete. This requires further investigation to understand what changed in the pages for this issue to surface.
Playwright times out on the "wait for ready" command. If there are JavaScript issues, then the ready flag will never be set and thus the timeout will occur.
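The mechanism described here can be illustrated without a browser. The sketch below is a plain-Python simulation, not Playwright's implementation: wait_for and page_state are hypothetical names, and the dictionary stands in for a ready flag that page scripts would normally set.

```python
import time
from typing import Callable


def wait_for(predicate: Callable[[], bool], timeout_s: float, poll_s: float = 0.01) -> None:
    """Poll `predicate` until it returns True; raise TimeoutError once `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if predicate():
            return
        time.sleep(poll_s)
    raise TimeoutError(f"Timeout {int(timeout_s * 1000)}ms exceeded.")


# Simulated page state: a script error means nothing ever flips the flag to True.
page_state = {"ready": False}

try:
    wait_for(lambda: page_state["ready"], timeout_s=0.05)
    timed_out = False
except TimeoutError:
    timed_out = True

print("timed out:", timed_out)
```

The waiter cannot tell a slow page from a broken one: in both cases the predicate stays false, so the only visible symptom is the timeout.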
@dangne We were able to track down the issue to an update that was made to playwright. We are still figuring out how to fix the issue, but in the meantime, downgrading with pip install playwright==1.44.0 should fix it (note that you will need Python <= 3.12).
Can you please give it a shot and let us know?
Hi, everything ran smoothly this time! However, for 2 out of 3 tasks, the rewards did not equal 1. Is this a bug?
Glad it ran this time!
Would you be able to tell me which tasks these were?
We validated all the tasks in the benchmark but this could be due to another playwright glitch.
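To report which tasks failed validation, the per-task rewards can be collected during the loop and filtered afterwards. A minimal sketch, assuming the rewards were stored in a dictionary keyed by task name; failed_tasks and the task names below are placeholders, not names from the benchmark:

```python
from typing import Dict, List


def failed_tasks(rewards: Dict[str, float]) -> List[str]:
    """Return the names of tasks whose reward is not exactly 1, sorted alphabetically."""
    return sorted(name for name, reward in rewards.items() if reward != 1)


# Placeholder data for illustration only; real names would come from str(task) in the demo loop.
example_rewards = {"PlaceholderTaskA": 1, "PlaceholderTaskB": 0, "PlaceholderTaskC": 0}
print(failed_tasks(example_rewards))
```

In the demo script, this would mean adding something like rewards[str(task)] = reward after the call to env.task.validate.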
I have been trying to run the workarena-install command for a while but keep running into the same issue again and again!
workarena-install
INFO:root:
██ ██ ██████ ██████ ██ ██ █████ ██████ ███████ ███ ██ █████
██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ████ ██ ██ ██
██ █ ██ ██ ██ ██████ █████ ███████ ██████ █████ ██ ██ ██ ███████
██ ███ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██ ██
███ ███ ██████ ██ ██ ██ ██ ██ ██ ██ ██ ███████ ██ ████ ██ ██
Instance: https://dev271536.service-now.com
Previous installation: never
INFO:root:URL login enabled.
INFO:root:Setting default home page
INFO:root:Guided tours disabled.
INFO:root:Analytics popups disabled.
INFO:root:Welcome help popup disabled.
INFO:root:Installing custom UI themes...
INFO:root:Uploading update set...
INFO:root:Applying update set...
INFO:root:... WorkArena UI Themes
INFO:root:Setting default UI theme
INFO:root:Wiping all system admin preferences
INFO:root:... Deleting all preferences
INFO:root:...... deleting workspace.showAgentAssist
INFO:root:...... deleting workspace.showRibbon
INFO:root:...... deleting sys_update_set
INFO:root:...... deleting menu.08771d0cc0a8016401f604303b94b999.expanded
...
...
INFO:root:Setting up visible list columns...
INFO:root:... Creating a new user account to validate list columns
INFO:root:... Setting up default view for list alm_asset
INFO:root:...... Fetching default view for list alm_asset...
INFO:root:...... Fetching existing columns for default view of list alm_asset...
INFO:root:...... Deleting existing columns for default view of list alm_asset...
INFO:root:...... Adding expected columns to default view of list alm_asset...
INFO:root:......... asset_tag
INFO:root:......... model.display_name
INFO:root:......... model_category
INFO:root:......... sys_class_name
INFO:root:......... assigned_to
INFO:root:......... location
INFO:root:......... company
INFO:root:......... department
INFO:root:......... install_status
INFO:root:......... warranty_expiration
INFO:root:...... Done.
INFO:root:All columns properly displayed for /now/nav/ui/classic/params/target/alm_asset_list.do.
INFO:root:... Setting up default view for list alm_hardware
INFO:root:...... Fetching default view for list alm_hardware...
INFO:root:...... Fetching existing columns for default view of list alm_hardware...
INFO:root:...... Deleting existing columns for default view of list alm_hardware...
INFO:root:...... Adding expected columns to default view of list alm_hardware...
INFO:root:......... display_name
INFO:root:......... model_category
INFO:root:......... serial_number
INFO:root:......... assigned_to
INFO:root:......... company
INFO:root:......... cost_center
INFO:root:......... install_status
INFO:root:......... ci
INFO:root:......... purchase_date
INFO:root:......... warranty_expiration
INFO:root:...... Done.
INFO:root:All columns properly displayed for /now/nav/ui/classic/params/target/alm_hardware_list.do.
INFO:root:... Setting up default view for list change_request
INFO:root:...... Fetching default view for list change_request...
INFO:root:...... Fetching existing columns for default view of list change_request...
INFO:root:...... Deleting existing columns for default view of list change_request...
INFO:root:...... Adding expected columns to default view of list change_request...
INFO:root:......... number
INFO:root:......... short_description
INFO:root:......... risk
INFO:root:......... impact
INFO:root:......... priority
INFO:root:......... assigned_to
INFO:root:......... start_date
INFO:root:......... end_date
INFO:root:......... implementation_plan
INFO:root:......... approval
INFO:root:...... Done.
INFO:root:All columns properly displayed for /now/nav/ui/classic/params/target/change_request_list.do.
INFO:root:... Setting up default view for list incident
INFO:root:...... Fetching default view for list incident...
INFO:root:...... Fetching existing columns for default view of list incident...
INFO:root:...... Deleting existing columns for default view of list incident...
INFO:root:...... Adding expected columns to default view of list incident...
INFO:root:......... number
INFO:root:......... caller_id
INFO:root:......... category
INFO:root:......... priority
INFO:root:......... impact
INFO:root:......... state
INFO:root:......... short_description
INFO:root:......... assigned_to
INFO:root:......... company
INFO:root:......... sys_created_on
INFO:root:...... Done.
INFO:root:An error occurred. Retrying...
INFO:root:URL login enabled.
INFO:root:Setting default home page
INFO:root:Guided tours disabled.
INFO:root:Analytics popups disabled.
INFO:root:Welcome help popup disabled.
INFO:root:Installing custom UI themes...
INFO:root:Uploading update set...
INFO:root:Applying update set...
INFO:root:An error occurred. Retrying...
INFO:root:URL login enabled.
INFO:root:Setting default home page
INFO:root:Guided tours disabled.
INFO:root:Analytics popups disabled.
INFO:root:Welcome help popup disabled.
INFO:root:Installing custom UI themes...
INFO:root:Uploading update set...
INFO:root:Applying update set...
Traceback (most recent call last):
File "/Users/nidhirbhavsar/.pyenv/versions/workarena_env/bin/workarena-install", line 8, in <module>
sys.exit(main())
^^^^^^
File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/browsergym/workarena/install.py", line 1075, in main
setup()
File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/tenacity/__init__.py", line 336, in wrapped_f
return copy(f, *args, **kw)
^^^^^^^^^^^^^^^^^^^^
File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/tenacity/__init__.py", line 475, in __call__
do = self.iter(retry_state=retry_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/tenacity/__init__.py", line 376, in iter
result = action(retry_state)
^^^^^^^^^^^^^^^^^^^
File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/tenacity/__init__.py", line 418, in exc_check
raise retry_exc.reraise()
^^^^^^^^^^^^^^^^^^^
File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/tenacity/__init__.py", line 185, in reraise
raise self.last_attempt.result()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/lib/python3.12/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/tenacity/__init__.py", line 478, in __call__
result = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/browsergym/workarena/install.py", line 1025, in setup
setup_ui_themes()
File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/browsergym/workarena/install.py", line 837, in setup_ui_themes
_install_update_set(path=UI_THEMES_UPDATE_SET["update_set"], name=UI_THEMES_UPDATE_SET["name"])
File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/browsergym/workarena/install.py", line 138, in _install_update_set
update_set = table_api_call(
^^^^^^^^^^^^^^^
IndexError: list index out of range
(runtime: 35m 55.78s)
Any ideas for troubleshooting this?
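One plausible reading of the traceback, offered as an assumption rather than a diagnosis: the statement starting at update_set = table_api_call( ends by indexing into a list of returned records, and that list was empty. The sketch below shows how such a lookup could fail with a clearer message. The {"result": [...]} response shape and the helper name first_result are assumptions for illustration, not WorkArena's code.

```python
from typing import Any, Dict


def first_result(response: Dict[str, Any], what: str) -> Dict[str, Any]:
    """Return the first record of a response shaped like {"result": [...]}.

    Raises RuntimeError with context instead of a bare IndexError when the list is empty.
    """
    records = response.get("result", [])
    if not records:
        raise RuntimeError(f"No record returned for {what}; the previous step may not have completed.")
    return records[0]


# An empty result list is the case that would otherwise surface as "list index out of range".
try:
    first_result({"result": []}, "update set")
except RuntimeError as err:
    print(err)
```

If this reading is right, the question to investigate is why the instance returned no matching record after the upload step.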
@dangne We were able to track down the issue to an update that was made to playwright. We are still figuring out how to fix the issue, but in the meantime, downgrading with pip install playwright==1.44.0 should fix it (note that you will need Python <= 3.12). Can you please give it a shot and let us know?
Have there been any fixes for this lately? I encountered the same issue; however, my system crashes with playwright==1.44.0. I had the same issue with WebArena in the past, and nothing except playwright>=1.47.0 with Python 3.12 works on my machine (strange, right?!).
Same here! I've tried 1) Python 3.13 + playwright 1.49.0 ⛔️ 2) Python 3.12 + playwright 1.44.0 ⛔️ 3) Python 3.12 + playwright 1.47.0 ✅, and the third option worked, as mentioned above!
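The combinations reported in this thread can be gathered into a small lookup so that a setup can be checked against them. This is only a record of what participants reported, not an official support matrix. The Python 3.11 entry is inferred from the traceback paths of the original poster, who reported that the downgrade to 1.44.0 worked. The names REPORTED and reported_outcome are hypothetical.

```python
import sys
from typing import Dict, Optional, Tuple

# (Python major, minor), playwright version -> did it work for the person who reported it?
REPORTED: Dict[Tuple[Tuple[int, int], str], bool] = {
    ((3, 11), "1.44.0"): True,   # original poster after the suggested downgrade
    ((3, 13), "1.49.0"): False,
    ((3, 12), "1.44.0"): False,
    ((3, 12), "1.47.0"): True,
}


def reported_outcome(python_version: Tuple[int, int], playwright_version: str) -> Optional[bool]:
    """Return the outcome reported in the thread, or None if nobody reported that combination."""
    return REPORTED.get((python_version, playwright_version))


if __name__ == "__main__":
    current = (sys.version_info.major, sys.version_info.minor)
    print(current, "with playwright 1.47.0:", reported_outcome(current, "1.47.0"))
```

A None result only means the combination was not mentioned here; it says nothing about whether it works.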
Hi, I am having trouble at the workarena-install step. Below is the traceback: Thanks so much for your help!