web-arena-x / webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
https://webarena.dev
Apache License 2.0

Is "image" observation_type supported? #41

Closed Jiayi-Pan closed 1 year ago

Jiayi-Pan commented 1 year ago

Thank you for this great project! I am experimenting with the browser_env you created, and the program fails when observation_type="image".

Below is the script to reproduce the problem, followed by the error message I got:

import random
from browser_env import ScriptBrowserEnv, create_id_based_action
# init the environment
env = ScriptBrowserEnv(
    headless=False,
    observation_type="image",
    current_viewport_only=True,
    viewport_size={"width": 1280, "height": 720},
)
# prepare the environment for a configuration defined in a json file
# config_file = "config_files/0.json"
config_file = "config_files/examples/2.json"
obs, info = env.reset(options={"config_file": config_file})
# get the text observation (e.g., html, accessibility tree) through obs["text"]

# create an action (the random id is generated but unused; element 0 is clicked)
id = random.randint(0, 1000)
action = create_id_based_action("click [0]")

# take the action
obs, _, terminated, _, info = env.step(action)

Traceback (most recent call last):
  File "/Users/jiayipan/code/webarena/demo.py", line 13, in <module>
    obs, info = env.reset(options={"config_file": config_file})
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<@beartype(browser_env.envs.ScriptBrowserEnv.reset) at 0x10fb00400>", line 51, in reset
  File "/Users/jiayipan/code/webarena/browser_env/envs.py", line 216, in reset
    observation = self._get_obs()
                  ^^^^^^^^^^^^^^^
  File "<@beartype(browser_env.envs.ScriptBrowserEnv._get_obs) at 0x10fb000e0>", line 10, in _get_obs
  File "/Users/jiayipan/code/webarena/browser_env/envs.py", line 177, in _get_obs
    obs = self.observation_handler.get_observation(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<@beartype(browser_env.processors.ObservationHandler.get_observation) at 0x10fa858a0>", line 52, in get_observation
  File "/Users/jiayipan/code/webarena/browser_env/processors.py", line 656, in get_observation
    text_obs = self.text_processor.process(page, client)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<@beartype(browser_env.processors.TextObervationProcessor.process) at 0x10fa85300>", line 52, in process
  File "/Users/jiayipan/code/webarena/browser_env/processors.py", line 567, in process
    raise ValueError(
ValueError: Invalid observatrion type:

shuyanzhou commented 1 year ago

The image observation is available by default; you do not need to set observation_type to "image" to get it:

from PIL import Image

obs, info = env.reset(options={"config_file": config_file})

# the screenshot is returned alongside the text observation
img_obs = obs["image"]
# you can save the image as well
image = Image.fromarray(img_obs)
image.save('output_filename.png')

We will update the code to resolve this unintuitive design.
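
For anyone hitting the same error, here is a minimal sketch of the workaround described above: keep observation_type set to a text mode and read the screenshot from obs["image"]. The helper names env_kwargs, save_screenshot, and demo are hypothetical and not part of browser_env. The tuple of accepted text modes is an assumption based on the traceback, which shows the text processor rejecting the value; check it against your checkout.

```python
from typing import Any, Dict, Mapping

# Assumption: text modes accepted by observation_type in the version discussed
# in this thread. "image" is not one of them, which is what triggers the error.
ASSUMED_TEXT_OBSERVATION_TYPES = ("html", "accessibility_tree")


def env_kwargs(text_observation_type: str = "accessibility_tree") -> Dict[str, Any]:
    """Build ScriptBrowserEnv keyword arguments that avoid the ValueError above."""
    if text_observation_type not in ASSUMED_TEXT_OBSERVATION_TYPES:
        raise ValueError(
            f"observation_type should be one of {ASSUMED_TEXT_OBSERVATION_TYPES}, "
            f"got {text_observation_type!r}; the screenshot is in obs['image'] anyway"
        )
    # The remaining values are copied from the reproduction script in this issue.
    return {
        "headless": False,
        "observation_type": text_observation_type,
        "current_viewport_only": True,
        "viewport_size": {"width": 1280, "height": 720},
    }


def save_screenshot(obs: Mapping[str, Any], path: str) -> None:
    """Write the array stored under obs["image"] to disk as an image file."""
    from PIL import Image  # third-party (Pillow), imported only when needed

    Image.fromarray(obs["image"]).save(path)


def demo(config_file: str = "config_files/examples/2.json") -> None:
    """End-to-end usage; needs a WebArena checkout and its running websites."""
    from browser_env import ScriptBrowserEnv

    env = ScriptBrowserEnv(**env_kwargs())
    obs, info = env.reset(options={"config_file": config_file})
    save_screenshot(obs, "output_filename.png")
```

Calling demo() inside a configured WebArena environment should write output_filename.png. The check in env_kwargs only turns the opaque "Invalid observatrion type" failure into an earlier, more descriptive one.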