web-arena-x / webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
https://webarena.dev
Apache License 2.0

best way to extract action sequence from execution trace? #99

Closed simonalford42 closed 8 months ago

simonalford42 commented 8 months ago

Thanks for this awesome benchmark!

I'm trying to parse and re-execute action sequences from an execution trace, e.g. take render_0.html from the 919_gpt4_8k_cot traces, parse the action sequence proposed by the LLM, and verify whether this action sequence solves the task following the logic used in minimal_example.py.

Is there a way to get a list of Action instances representing the actions taken by the LLM during its execution? My plan was to parse them out of the render_0.html file.

You have guidance on parsing the html from render_{i}.html here: https://github.com/web-arena-x/webarena/blob/main/resources/README.md#render_html

but I don't see how to recover an Action object (https://github.com/web-arena-x/webarena/blob/main/browser_env/actions.py#L94) from that html without writing some janky parsing logic of my own; doable, but it seems like the wrong way to go about things.

I saw there is a function parse_playwright_code for parsing Playwright-format actions, so I thought maybe that's the better way to reproduce the action sequences, but when I unzip the 919_gpt4_8k_cot folder, there is no trace/ folder to be found.

simonalford42 commented 8 months ago

Never mind, I figured something out. You can reparse the actions from the raw predictions:

from webarena.browser_env import create_id_based_action
from webarena.agent import construct_agent
from webarena.run import config, prepare
from bs4 import BeautifulSoup

# build an agent so we can reuse its prompt constructor for parsing
args = config()
args.model = 'gpt-4'
args.instruction_path = 'webarena/agent/prompts/jsons/p_cot_id_actree_2s.json'
prepare(args)
agent = construct_agent(args)

# pull the raw LLM predictions out of the rendered trace
with open('919_gpt4_8k_cot/render_0.html', 'r') as f:
    content = f.read()
soup = BeautifulSoup(content, 'html.parser')
raw_predictions = soup.find_all("div", {"class": "raw_parsed_prediction"})

# extract the action string from each prediction, then convert it into an Action
actions = [agent.prompt_constructor.extract_action(p.pre.text) for p in raw_predictions]
actions = [create_id_based_action(a) for a in actions]
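
From there, replaying the parsed actions can follow the same loop as minimal_example.py. A minimal sketch, assuming the environment settings from the README's example and that task 919's config file lives at config_files/919.json (adjust both to your setup; the import path just mirrors the ones above):

from webarena.browser_env import ScriptBrowserEnv

# spin up the scripted browser environment (settings mirror minimal_example.py;
# use whatever observation_type / viewport the trace was generated with)
env = ScriptBrowserEnv(
    headless=True,
    observation_type="accessibility_tree",
    current_viewport_only=True,
    viewport_size={"width": 1280, "height": 720},
)

# reset to the task the trace was recorded on (config file path is an assumption)
obs, info = env.reset(options={"config_file": "config_files/919.json"})

# replay the re-parsed actions in order
for action in actions:
    obs, _, terminated, _, info = env.step(action)
    if terminated:
        break

env.close()

Scoring the final state should then follow the evaluator logic used in run.py (evaluator_router over the task's config file), though I haven't wired that part up.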