web-arena-x / webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
https://webarena.dev
Apache License 2.0

How to convert accessibility tree nodes into elements that can be clicked? #97

Closed HarryZhou-618 closed 5 months ago

HarryZhou-618 commented 5 months ago

WebArena is really great work! But I ran into a problem when processing the accessibility (AX) tree I obtained. How can I convert an AX node into an operable DOM element so that I can click it or type into it? It seems that I only have the node's backendNodeId. I tried to get its nodeId using the CDP command DOM.describeNode, but the returned nodeId is 0, which puzzles me. How do you manipulate these elements? Looking forward to your reply~

shuyanzhou commented 5 months ago

Hi @HarryZhou-618, sorry for the late reply.

The backend node ID comes with the accessibility tree. Could you check whether this line is what you are looking for?

We operate on nodes by (1) identifying the bounding box of each node and (2) applying the action to that bounding box. Here is the code to get the bounding box of each node, and here is an example of applying a type action to a node: it first clicks the center of the bounding box and then types the content.
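The two steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the helper names `get_bounding_box`, `box_center`, and `type_into_node` are hypothetical, and `page` and `cdp` are assumed to be a Playwright page and a CDP session (or any objects with the same `mouse.click`, `keyboard.type`, and `send` methods). It resolves the node from its `backendNodeId` with `DOM.resolveNode`, reads its bounding rectangle with `Runtime.callFunctionOn`, then clicks the center and types.

```python
from typing import Any, Dict, Tuple

# JavaScript evaluated with `this` bound to the resolved DOM node.
# It returns a plain object so the values survive returnByValue serialization.
RECT_JS = """function() {
  const r = this.getBoundingClientRect();
  return {x: r.x, y: r.y, width: r.width, height: r.height};
}"""


def get_bounding_box(cdp: Any, backend_node_id: int) -> Dict[str, float]:
    """Return the viewport-relative bounding box of a node (hypothetical helper).

    `cdp` is assumed to expose send(method, params), like a Playwright CDPSession.
    """
    # Step 1a: turn the backend node ID into a JavaScript remote object.
    remote = cdp.send("DOM.resolveNode", {"backendNodeId": backend_node_id})
    object_id = remote["object"]["objectId"]
    # Step 1b: call getBoundingClientRect() on that object.
    result = cdp.send(
        "Runtime.callFunctionOn",
        {
            "objectId": object_id,
            "functionDeclaration": RECT_JS,
            "returnByValue": True,
        },
    )
    return result["result"]["value"]


def box_center(box: Dict[str, float]) -> Tuple[float, float]:
    """Compute the center point of an {x, y, width, height} box."""
    return box["x"] + box["width"] / 2, box["y"] + box["height"] / 2


def type_into_node(page: Any, cdp: Any, backend_node_id: int, text: str) -> None:
    """Step 2: click the center of the node's box to focus it, then type."""
    x, y = box_center(get_bounding_box(cdp, backend_node_id))
    page.mouse.click(x, y)
    page.keyboard.type(text)
```

With Playwright's sync API, a session for `cdp` can be created with `page.context.new_cdp_session(page)`. Because the coordinates are viewport-relative, this sketch assumes the node is already scrolled into view.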

Let me know if this answers your questions, and feel free to follow up.

HarryZhou-618 commented 5 months ago

Thanks for your reply! This solved my problem!