ServiceNow / WorkArena

WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
https://servicenow.github.io/WorkArena/

Error in workarena-install #50

Open dangne opened 2 weeks ago

dangne commented 2 weeks ago

Hi, I am having trouble at the workarena-install step. Below is the traceback:

Instance: https://dev257112.service-now.com
Previous installation: never

INFO:root:An error occurred. Retrying...
INFO:root:An error occurred. Retrying...
Traceback (most recent call last):
  File "/Users/dang/.local/bin/workarena-install", line 8, in <module>
    sys.exit(main())
  File "/Users/dang/.local/lib/python3.10/site-packages/browsergym/workarena/install.py", line 1075, in main
    setup()
  File "/Users/dang/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 336, in wrapped_f
    return copy(f, *args, **kw)
  File "/Users/dang/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 475, in __call__
    do = self.iter(retry_state=retry_state)
  File "/Users/dang/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 376, in iter
    result = action(retry_state)
  File "/Users/dang/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 418, in exc_check
    raise retry_exc.reraise()
  File "/Users/dang/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 185, in reraise
    raise self.last_attempt.result()
  File "/Users/dang/miniconda3/envs/hero/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/Users/dang/miniconda3/envs/hero/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/Users/dang/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 478, in __call__
    result = fn(*args, **kwargs)
  File "/Users/dang/.local/lib/python3.10/site-packages/browsergym/workarena/install.py", line 1007, in setup
    if not check_instance_release_support():
  File "/Users/dang/.local/lib/python3.10/site-packages/browsergym/workarena/install.py", line 783, in check_instance_release_support
    version_info = instance.release_version
  File "/Users/dang/.local/lib/python3.10/site-packages/browsergym/workarena/instance.py", line 113, in release_version
    ui_login(self, page)
  File "/Users/dang/.local/lib/python3.10/site-packages/browsergym/workarena/utils.py", line 58, in ui_login
    page.goto(instance.snow_url)
  File "/Users/dang/.local/lib/python3.10/site-packages/playwright/sync_api/_generated.py", line 9303, in goto
    self._sync(
  File "/Users/dang/.local/lib/python3.10/site-packages/playwright/_impl/_sync_base.py", line 109, in _sync
    return task.result()
  File "/Users/dang/.local/lib/python3.10/site-packages/playwright/_impl/_page.py", line 473, in goto
    return await self._main_frame.goto(**locals_to_params(locals()))
  File "/Users/dang/.local/lib/python3.10/site-packages/playwright/_impl/_frame.py", line 138, in goto
    await self._channel.send("goto", locals_to_params(locals()))
  File "/Users/dang/.local/lib/python3.10/site-packages/playwright/_impl/_connection.py", line 61, in send
    return await self._connection.wrap_api_call(
  File "/Users/dang/.local/lib/python3.10/site-packages/playwright/_impl/_connection.py", line 490, in wrap_api_call
    return await cb()
  File "/Users/dang/.local/lib/python3.10/site-packages/playwright/_impl/_connection.py", line 99, in inner_send
    result = next(iter(done)).result()
playwright._impl._api_types.Error: Page closed
=========================== logs ===========================
navigating to "https://dev257112.service-now.com/", waiting until "load"
============================================================

Thanks so much for your help!
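
A note for reading the traceback above: the two "An error occurred. Retrying..." lines followed by frames from tenacity suggest that setup() runs under a retry wrapper that re-raises the last error once its attempts are used up. The sketch below imitates that pattern in plain Python. The attempt count of three and the names run_with_retries and flaky_setup are assumptions for illustration, not WorkArena's actual code.

```python
import logging

logging.basicConfig(level=logging.INFO)


def run_with_retries(fn, attempts=3):
    """Call fn(); on failure log and retry, re-raising the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the original exception
            logging.info("An error occurred. Retrying...")


calls = []


def flaky_setup():
    # Hypothetical stand-in for the installer's setup(): always fails.
    calls.append(1)
    raise RuntimeError("Page closed")


try:
    run_with_retries(flaky_setup)
except RuntimeError as exc:
    print(f"failed after {len(calls)} attempts: {exc}")
```

Under these assumptions, the final traceback describes only the last attempt; earlier failures leave nothing but the "Retrying..." log lines.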

aldro61 commented 2 weeks ago

Hi! Please share your credentials by email and I will try to understand: alexandre.drouin@servicenow.com.

aldro61 commented 2 weeks ago

Thanks for the credentials. I'm currently running the installer on your instance. It is a bit slower than usual, but it appears to be working.

I will let it run and let you know when it completes.

aldro61 commented 2 weeks ago

Your instance is fully set up now; there is no need to re-run on your side. I'm not sure what went wrong. It could have been network issues on your end or a temporary glitch at our data center.

Please re-open if you run into any issues while benchmarking.

dangne commented 2 weeks ago

Thank you for your help. I'm trying to run the following example script from the README file:

import random

from browsergym.core.env import BrowserEnv
from browsergym.workarena import ALL_WORKARENA_TASKS
from time import sleep

random.shuffle(ALL_WORKARENA_TASKS)
for task in ALL_WORKARENA_TASKS:
    print("Task:", task)

    # Instantiate a new environment
    env = BrowserEnv(task_entrypoint=task,
                    headless=False)
    env.reset()

    # Cheat functions use Playwright to automatically solve the task
    env.chat.add_message(role="assistant", msg="On it. Please wait...")
    cheat_messages = []
    env.task.cheat(env.page, cheat_messages)

    # Send cheat messages to chat
    for cheat_msg in cheat_messages:
        env.chat.add_message(role=cheat_msg["role"], msg=cheat_msg["message"])

    # Post solution to chat
    env.chat.add_message(role="assistant", msg="I'm done!")

    # Validate the solution
    reward, stop, message, info = env.task.validate(env.page, cheat_messages)
    if reward == 1:
        env.chat.add_message(role="user", msg="Yes, that works. Thanks!")
    else:
        env.chat.add_message(role="user", msg=f"No, that doesn't work. {info.get('message', '')}")

    sleep(3)
    env.close()

But I got an error similar to the one above (with a different traceback):

Task: <class 'browsergym.workarena.tasks.compositional.DashboardRetrieveIncidentAndMeanRequestGoogleNexus7TaskL2'>
Traceback (most recent call last):
  File "/Users/dang/projects/hero/try_workarena.py", line 15, in <module>
    env.reset()
  File "/Users/dang/.local/lib/python3.11/site-packages/browsergym/core/env.py", line 281, in reset
    self.chat = Chat(
                ^^^^^
  File "/Users/dang/.local/lib/python3.11/site-packages/browsergym/core/chat.py", line 35, in __init__
    self.page = self.context.new_page()
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dang/.local/lib/python3.11/site-packages/playwright/sync_api/_generated.py", line 13095, in new_page
    return mapping.from_impl(self._sync(self._impl_obj.new_page()))
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dang/.local/lib/python3.11/site-packages/playwright/_impl/_sync_base.py", line 109, in _sync
    return task.result()
           ^^^^^^^^^^^^^
  File "/Users/dang/.local/lib/python3.11/site-packages/playwright/_impl/_browser_context.py", line 281, in new_page
    return from_channel(await self._channel.send("newPage"))
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dang/.local/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 61, in send
    return await self._connection.wrap_api_call(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dang/.local/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 490, in wrap_api_call
    return await cb()
           ^^^^^^^^^^
  File "/Users/dang/.local/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 99, in inner_send
    result = next(iter(done)).result()
             ^^^^^^^^^^^^^^^^^^^^^^^^^
playwright._impl._api_types.Error: Page closed

aldro61 commented 2 weeks ago

This is very strange. It seems to be an issue with your playwright installation. Could you tell me more about your setup?

Is this a remote server? Did you run playwright install? Anything else you think might help.

@gasse does this issue ring a bell?
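
For setup questions like the ones above, a short script can gather the relevant versions in one go. This is a minimal sketch: describe_environment is a hypothetical helper, and the distribution name browsergym-workarena is an assumption about how the package was installed. It reports the interpreter path as well, since the tracebacks in this thread show packages loaded from both ~/.local and a conda environment.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages=("playwright", "browsergym-workarena")):
    """Collect interpreter and package versions for a bug report."""
    report = {
        "python": platform.python_version(),
        "executable": sys.executable,
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

The script only inspects installed Python packages; it does not verify that the browsers were downloaded with playwright install.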

aldro61 commented 2 weeks ago

@dangne I can't reproduce your issue on my end. However, I'm running into other issues later in the demo. I will address these ASAP. cc @jardinetsouffleton

dangne commented 2 weeks ago

Okay, I fixed the problem by updating playwright to the latest version with pip install -U playwright and running playwright install again.

aldro61 commented 2 weeks ago

Can you give this demo script a shot please?

import random

from browsergym.core.env import BrowserEnv
from browsergym.workarena import ALL_WORKARENA_TASKS
from browsergym.workarena.tasks.compositional.base import CompositionalTask
from time import sleep

random.shuffle(ALL_WORKARENA_TASKS)
for task in ALL_WORKARENA_TASKS:
    print("Task:", task)

    # Instantiate a new environment
    env = BrowserEnv(task_entrypoint=task,
                    headless=False)
    env.reset()

    # Cheat functions use Playwright to automatically solve the task
    env.chat.add_message(role="assistant", msg="On it. Please wait...")
    cheat_messages = []
    if isinstance(env.task, CompositionalTask):
        # Need to cheat for all subtasks
        for i in range(len(env.task.subtasks)):
            env.task.cheat(env.page, cheat_messages, i)
    else:
        env.task.cheat(env.page, cheat_messages)

    # Send cheat messages to chat
    for cheat_msg in cheat_messages:
        env.chat.add_message(role=cheat_msg["role"], msg=cheat_msg["message"])

    # Post solution to chat
    env.chat.add_message(role="assistant", msg="I'm done!")

    # Validate the solution
    reward, stop, message, info = env.task.validate(env.page, cheat_messages)
    if reward == 1:
        env.chat.add_message(role="user", msg="Yes, that works. Thanks!")
    else:
        env.chat.add_message(role="user", msg=f"No, that doesn't work. {info.get('message', '')}")

    sleep(3)
    env.close()

dangne commented 2 weeks ago

Yes, it (kinda) works. The browser and chat windows appear normally, but now I get timeout errors at the env.task.cheat(env.page, cheat_messages, i) step for every task.

Task: <class 'browsergym.workarena.tasks.compositional.TwoChangesFixBasicVariedRiskChangeRequestSchedulingTaskL3'>
Traceback (most recent call last):
  File "/Users/dang/projects/hero/try_workarena.py", line 76, in <module>
    env.task.cheat(env.page, cheat_messages, i)
  File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/browsergym/workarena/tasks/compositional/base.py", line 198, in cheat
    self.subtasks[subtask_idx].cheat(page, chat_messages)
  File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/browsergym/workarena/tasks/list.py", line 667, in cheat
    self._wait_for_ready(page)
  File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/browsergym/workarena/tasks/list.py", line 189, in _wait_for_ready
    page.wait_for_function(
  File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/playwright/sync_api/_generated.py", line 11537, in wait_for_function
    self._sync(
  File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/playwright/_impl/_sync_base.py", line 115, in _sync
    return task.result()
           ^^^^^^^^^^^^^
  File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/playwright/_impl/_page.py", line 1083, in wait_for_function
    return await self._main_frame.wait_for_function(**locals_to_params(locals()))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/playwright/_impl/_frame.py", line 771, in wait_for_function
    return from_channel(await self._channel.send("waitForFunction", params))
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 59, in send
    return await self._connection.wrap_api_call(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dang/miniconda3/envs/hero/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 520, in wrap_api_call
    raise rewrite_error(error, f"{parsed_st['apiName']}: {error}") from None
playwright._impl._errors.TimeoutError: Page.wait_for_function: Timeout 30000ms exceeded.

aldro61 commented 2 weeks ago

I was able to reproduce. Not sure what is causing this. Will look into it asap.

aldro61 commented 2 weeks ago

@jardinetsouffleton this issue seems to be limited to list pages. Inspecting the js console, it looks like there is an issue with the javascript that checks if the page load is complete. This requires further investigation to understand what changed in the pages for this issue to surface.

Playwright times out on the "wait for ready" call. If the JavaScript fails, the ready flag is never set, so the wait times out.
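
To illustrate why a script error shows up as a timeout rather than as the underlying error: a wait of this kind polls a condition until it becomes true or a deadline passes. The sketch below is plain Python, not Playwright's implementation; wait_for, healthy_page and broken_page are hypothetical names, and the 0.5 s timeout is chosen only to keep the example fast.

```python
import time


def wait_for(predicate, timeout_s=0.5, poll_s=0.05):
    """Poll predicate() until it is truthy or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(poll_s)
    raise TimeoutError(f"Timeout {int(timeout_s * 1000)}ms exceeded.")


# Hypothetical page state: a load script is supposed to set "ready".
healthy_page = {"ready": True}
broken_page = {}  # the script crashed, so the flag was never set

print(wait_for(lambda: healthy_page.get("ready")))

try:
    wait_for(lambda: broken_page.get("ready"))
except TimeoutError as exc:
    print("broken page:", exc)
```

The waiter never sees the original script failure; it only observes that the flag stayed unset, which is why the JS console is the place to look for the real error.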

aldro61 commented 2 weeks ago

@dangne We were able to track down the issue to an update that was made to playwright. We are still figuring out how to fix the issue, but in the meantime, downgrading to pip install playwright==1.44.0 should fix it (note that you will need Python <= 3.12).

Can you please give it a shot and let us know?

dangne commented 2 weeks ago

Hi, everything ran smoothly this time! However, for 2 out of 3 tasks, the rewards did not equal 1. Is this a bug?

aldro61 commented 2 weeks ago

Glad it ran this time!

Would you be able to tell me which tasks these were?

We validated all the tasks in the benchmark but this could be due to another playwright glitch.
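
One way to answer the "which tasks" question is to record the task name and reward on every iteration of the demo loop and print the failures at the end. This is a minimal sketch: summarize_rewards is a hypothetical helper, and the records below are placeholders rather than results from this thread.

```python
def summarize_rewards(results):
    """Split (task_name, reward, message) records into passed and failed."""
    passed = [name for name, reward, _ in results if reward == 1]
    failed = [(name, msg) for name, reward, msg in results if reward != 1]
    return passed, failed


# Placeholder records; in the demo loop each would come from
# task.__name__ and the values returned by env.task.validate(...).
results = [
    ("TaskA", 1, ""),
    ("TaskB", 0, "placeholder message"),
    ("TaskC", 0, "placeholder message"),
]

passed, failed = summarize_rewards(results)
print(f"{len(passed)} passed, {len(failed)} failed")
for name, msg in failed:
    print(f"FAILED {name}: {msg}")
```

Because the demo shuffles ALL_WORKARENA_TASKS, logging the class name alongside the reward is what makes a failing run reportable.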

Nid989 commented 6 days ago

I have been trying to run the workarena-install command for a while, but I keep hitting the same issue again and again!

workarena-install
INFO:root:

██     ██  ██████  ██████  ██   ██  █████  ██████  ███████ ███    ██  █████
██     ██ ██    ██ ██   ██ ██  ██  ██   ██ ██   ██ ██      ████   ██ ██   ██
██  █  ██ ██    ██ ██████  █████   ███████ ██████  █████   ██ ██  ██ ███████
██ ███ ██ ██    ██ ██   ██ ██  ██  ██   ██ ██   ██ ██      ██  ██ ██ ██   ██
 ███ ███   ██████  ██   ██ ██   ██ ██   ██ ██   ██ ███████ ██   ████ ██   ██

Instance: https://dev271536.service-now.com
Previous installation: never

INFO:root:URL login enabled.
INFO:root:Setting default home page
INFO:root:Guided tours disabled.
INFO:root:Analytics popups disabled.
INFO:root:Welcome help popup disabled.
INFO:root:Installing custom UI themes...
INFO:root:Uploading update set...
INFO:root:Applying update set...
INFO:root:... WorkArena UI Themes
INFO:root:Setting default UI theme
INFO:root:Wiping all system admin preferences
INFO:root:... Deleting all preferences
INFO:root:...... deleting workspace.showAgentAssist
INFO:root:...... deleting workspace.showRibbon
INFO:root:...... deleting sys_update_set
INFO:root:...... deleting menu.08771d0cc0a8016401f604303b94b999.expanded
...
...
INFO:root:Setting up visible list columns...
INFO:root:... Creating a new user account to validate list columns
INFO:root:... Setting up default view for list alm_asset
INFO:root:...... Fetching default view for list alm_asset...
INFO:root:...... Fetching existing columns for default view of list alm_asset...
INFO:root:...... Deleting existing columns for default view of list alm_asset...
INFO:root:...... Adding expected columns to default view of list alm_asset...
INFO:root:......... asset_tag
INFO:root:......... model.display_name
INFO:root:......... model_category
INFO:root:......... sys_class_name
INFO:root:......... assigned_to
INFO:root:......... location
INFO:root:......... company
INFO:root:......... department
INFO:root:......... install_status
INFO:root:......... warranty_expiration
INFO:root:...... Done.
INFO:root:All columns properly displayed for /now/nav/ui/classic/params/target/alm_asset_list.do.
INFO:root:... Setting up default view for list alm_hardware
INFO:root:...... Fetching default view for list alm_hardware...
INFO:root:...... Fetching existing columns for default view of list alm_hardware...
INFO:root:...... Deleting existing columns for default view of list alm_hardware...
INFO:root:...... Adding expected columns to default view of list alm_hardware...
INFO:root:......... display_name
INFO:root:......... model_category
INFO:root:......... serial_number
INFO:root:......... assigned_to
INFO:root:......... company
INFO:root:......... cost_center
INFO:root:......... install_status
INFO:root:......... ci
INFO:root:......... purchase_date
INFO:root:......... warranty_expiration
INFO:root:...... Done.
INFO:root:All columns properly displayed for /now/nav/ui/classic/params/target/alm_hardware_list.do.
INFO:root:... Setting up default view for list change_request
INFO:root:...... Fetching default view for list change_request...
INFO:root:...... Fetching existing columns for default view of list change_request...
INFO:root:...... Deleting existing columns for default view of list change_request...
INFO:root:...... Adding expected columns to default view of list change_request...
INFO:root:......... number
INFO:root:......... short_description
INFO:root:......... risk
INFO:root:......... impact
INFO:root:......... priority
INFO:root:......... assigned_to
INFO:root:......... start_date
INFO:root:......... end_date
INFO:root:......... implementation_plan
INFO:root:......... approval
INFO:root:...... Done.
INFO:root:All columns properly displayed for /now/nav/ui/classic/params/target/change_request_list.do.
INFO:root:... Setting up default view for list incident
INFO:root:...... Fetching default view for list incident...
INFO:root:...... Fetching existing columns for default view of list incident...
INFO:root:...... Deleting existing columns for default view of list incident...
INFO:root:...... Adding expected columns to default view of list incident...
INFO:root:......... number
INFO:root:......... caller_id
INFO:root:......... category
INFO:root:......... priority
INFO:root:......... impact
INFO:root:......... state
INFO:root:......... short_description
INFO:root:......... assigned_to
INFO:root:......... company
INFO:root:......... sys_created_on
INFO:root:...... Done.
INFO:root:An error occurred. Retrying...
INFO:root:URL login enabled.
INFO:root:Setting default home page
INFO:root:Guided tours disabled.
INFO:root:Analytics popups disabled.
INFO:root:Welcome help popup disabled.
INFO:root:Installing custom UI themes...
INFO:root:Uploading update set...
INFO:root:Applying update set...
INFO:root:An error occurred. Retrying...
INFO:root:URL login enabled.
INFO:root:Setting default home page
INFO:root:Guided tours disabled.
INFO:root:Analytics popups disabled.
INFO:root:Welcome help popup disabled.
INFO:root:Installing custom UI themes...
INFO:root:Uploading update set...
INFO:root:Applying update set...
Traceback (most recent call last):
  File "/Users/nidhirbhavsar/.pyenv/versions/workarena_env/bin/workarena-install", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/browsergym/workarena/install.py", line 1075, in main
    setup()
  File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/tenacity/__init__.py", line 336, in wrapped_f
    return copy(f, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/tenacity/__init__.py", line 475, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/tenacity/__init__.py", line 376, in iter
    result = action(retry_state)
             ^^^^^^^^^^^^^^^^^^^
  File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/tenacity/__init__.py", line 418, in exc_check
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/tenacity/__init__.py", line 185, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/lib/python3.12/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/tenacity/__init__.py", line 478, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/browsergym/workarena/install.py", line 1025, in setup
    setup_ui_themes()
  File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/browsergym/workarena/install.py", line 837, in setup_ui_themes
    _install_update_set(path=UI_THEMES_UPDATE_SET["update_set"], name=UI_THEMES_UPDATE_SET["name"])
  File "/Users/nidhirbhavsar/.pyenv/versions/3.12.0/envs/workarena_env/lib/python3.12/site-packages/browsergym/workarena/install.py", line 138, in _install_update_set
    update_set = table_api_call(
                 ^^^^^^^^^^^^^^^
IndexError: list index out of range 

(runtime: 35m 55.78s)

Any ideas for troubleshooting this?

Nid989 commented 6 days ago

> @dangne We were able to track down the issue to an update that was made to playwright. We are still figuring out how to fix the issue, but in the meantime, downgrading to pip install playwright==1.44.0 should fix it (note that you will need Python <= 3.12).
>
> Can you please give it a shot and let us know?

Have there been any fixes for this lately? I encountered the same issue; however, my system crashes with playwright==1.44.0. I had the same issue with WebArena in the past, and nothing except playwright>=1.47.0 with Python 3.12 works on my machine (strange, right?!)

yeonjooooni commented 3 days ago

Same here! I've tried:

1) Python 3.13 + playwright 1.49.0 ⛔️
2) Python 3.12 + playwright 1.44.0 ⛔️
3) Python 3.12 + playwright 1.47.0 ✅

The third option worked, as mentioned above!