web-arena-x / webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
https://webarena.dev
Apache License 2.0

Collecting Human Trajectories Using Playwright #172

Open hsiachi opened 2 months ago

hsiachi commented 2 months ago

Hello,

I am currently working on a project that involves the collection of Human Trajectories, and I noticed your project utilizes Playwright for this purpose. I have a few questions regarding how you achieved this:

From my understanding, Playwright’s official tools typically create a new Chromium interface where the user can perform actions, and the tool records those actions. However, it seems that while it captures user operations, it doesn’t record the page information itself in real-time. The page details can be obtained upon replaying the actions, but I am curious if there is a method to record both the user actions and the page information simultaneously during the interaction.

Could you please share some insights or suggestions on how to approach this? Any guidance would be greatly appreciated!

Thank you for your time and this wonderful project!

shuyanzhou commented 2 months ago

Hi,

Have you looked into the zip file you obtain after recording the trace? It should contain the raw HTML snapshots, which you can inspect with the Playwright trace viewer. Note that Playwright traces can be quite noisy, though.
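
For reference, here is a minimal sketch of how such a trace zip might be produced and inspected with Playwright's Python API. It is not the WebArena recording script: the function names `record_trace` and `summarize_trace` and the output path `trace.zip` are hypothetical, and it is worth verifying that manual interactions show up in the trace with the detail you need. The file layout inside the zip can also differ between Playwright versions, so the helper only counts entries by suffix rather than assuming specific names.

```python
import zipfile
from collections import Counter
from pathlib import Path


def record_trace(url: str, out_path: str = "trace.zip") -> None:
    """Open a headed browser, let a person interact, then save a trace zip."""
    # Imported lazily so summarize_trace() below works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        context = browser.new_context()
        # snapshots=True asks Playwright to store DOM snapshots in the trace.
        context.tracing.start(screenshots=True, snapshots=True, sources=True)
        page = context.new_page()
        page.goto(url)
        # Opens the Playwright Inspector; the script waits until the user resumes.
        page.pause()
        context.tracing.stop(path=out_path)
        browser.close()


def summarize_trace(zip_path: str) -> Counter:
    """Count the entries of a trace zip by file suffix."""
    with zipfile.ZipFile(zip_path) as zf:
        return Counter(Path(name).suffix or "(none)" for name in zf.namelist())
```

After calling `record_trace(...)` with the URL of the site under study, `summarize_trace("trace.zip")` gives a quick overview of what was captured, and `playwright show-trace trace.zip` opens the same file in the trace viewer.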