Possibility of copying an environment or running multiple sessions

web-arena-x / webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"

https://webarena.dev

Apache License 2.0

647 stars 94 forks

Possibility of copying an environment or running multiple sessions #68

Closed zijianma17 closed 3 months ago

zijianma17 commented 8 months ago

Hi, thanks for your amazing environment. I am wondering whether an env can be replicated within a script. The reason I want to try this is that the axtree IDs change every time. If I could copy an env in its current state and run some tests on the copy, the tests would not influence the original one, which would be very convenient. I tried two approaches:

1. copy.deepcopy(env), which raised the error below: TypeError: cannot pickle '_queue.SimpleQueue' object.

2. Simply creating a new env and then calling reset, which also doesn't work because:

playwright._impl._api_types.Error: It looks like you are using Playwright Sync API inside the asyncio loop.
Please use the Async API instead.

Are there any simple functions to overcome the problem I met? Any instructions from you would also be very helpful. Thanks a lot!
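
For readers hitting the same TypeError, the failure can be reproduced without WebArena. The sketch below assumes the environment holds, somewhere in its browser machinery, a queue.SimpleQueue. EnvLike is a hypothetical stand-in, not a WebArena class.

```python
import copy
import queue


class EnvLike:
    """Hypothetical stand-in for an object that owns a live queue handle."""

    def __init__(self):
        # SimpleQueue wraps OS-level state and does not support pickling,
        # which is the protocol deepcopy falls back on for unknown objects.
        self.events = queue.SimpleQueue()


try:
    copy.deepcopy(EnvLike())
    error_message = None
except TypeError as exc:
    # deepcopy walks the instance attributes and fails on the queue.
    error_message = str(exc)

print(error_message)
```

Because the error comes from a live resource and not from plain data, adding a custom __deepcopy__ would only hide the handle. It would not produce an independent browser session.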

shuyanzhou commented 8 months ago

If you want to have a fixed axtree node - ID mapping given the current observation, one possible thing to try is to assign the IDs in the order of the tree. For example:

[164] textbox 'Search' focused: True required: False
[171] button 'Go'
[174] link 'Find directions between two points'

becomes

[99] ...
[100] textbox 'Search' focused: True required: False
[101] button 'Go'
[102] link 'Find directions between two points'
[103] ....

then you create a mapping between the underlying node ID and the displayed ID. The relevant code is here.
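
A minimal sketch of that renumbering, assuming the observation is plain text with one "[id] ..." node per line as in the example above. The function name renumber_axtree and the start parameter are hypothetical and are not part of the WebArena code base.

```python
import re

# Matches an optional indent, a bracketed numeric node ID, and the rest of the line.
NODE_LINE = re.compile(r"^(\s*)\[(\d+)\](.*)$")


def renumber_axtree(axtree_text, start=0):
    """Assign display IDs in tree order; return the new text and a display-to-node map."""
    display_to_node = {}
    out_lines = []
    next_id = start
    for line in axtree_text.splitlines():
        match = NODE_LINE.match(line)
        if match is None:
            # Lines without a node ID are passed through unchanged.
            out_lines.append(line)
            continue
        indent, node_id, rest = match.groups()
        display_to_node[next_id] = int(node_id)
        out_lines.append(f"{indent}[{next_id}]{rest}")
        next_id += 1
    return "\n".join(out_lines), display_to_node


observation = "\n".join([
    "[164] textbox 'Search' focused: True required: False",
    "[171] button 'Go'",
    "[174] link 'Find directions between two points'",
])
# start=100 only mimics the example above, where 100 earlier nodes precede these.
stable_text, display_to_node = renumber_axtree(observation, start=100)
print(stable_text)
print(display_to_node)
```

When the agent emits an action on a displayed ID, look it up in display_to_node to recover the underlying node ID before executing the action.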

Do you think this will resolve your problem?

zijianma17 commented 8 months ago

Thanks for your reply! This could alleviate my problem a lot. I will check the code. However, I would still like to know whether there is a way to replicate the env, because that might fundamentally prevent the problem I'm encountering.

shuyanzhou commented 7 months ago

Not sure if I understand your need completely, but I think that copying an environment may not be feasible since interactions with WebArena websites can cause irreversible changes, such as deleting a post.

zijianma17 commented 7 months ago

The reason I want to copy an env is to run (possibly wrong) tests while the original stays in its current state. For example: there are 3 repositories on a GitLab page, i.e. A, B, C. I need to go to A to complete the task, but the program may choose the wrong one. Testing and verifying an action on a copied env before executing it in the real one could be very helpful. But never mind, I was just trying to make clear what I meant. I will test the method you provided. Regarding "deleting a post", this raises a new question for me: if I logged in to e.g. Reddit and deleted a post, how can the env recover from this? Maybe by reloading the docker file?

shuyanzhou commented 7 months ago

Thanks for the clarification.

There are 3 repositories on a GitLab page, i.e. A, B, C. I need to go to A to complete the task, but the program may choose the wrong one. Testing and verifying an action on a copied env before executing it in the real one could be very helpful.

If I understand correctly, you are trying to do some search on a few possible actions. I can think of two approaches:

1. Execute the action and, if it is wrong, perform go_back().
2. Open a new tab with the corresponding link, e.g., new_tab(), goto(repo_a).

These should work for read-only actions (e.g., going to a new page), which do not change the state of the environment.
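
The first approach can be sketched as a try-and-backtrack loop. ToyBrowser below is a hypothetical stand-in that only models a URL history stack. It is not the WebArena environment API, and the URLs are placeholders. In a real run, goto and go_back would be issued as environment actions.

```python
from typing import Callable, List, Optional


class ToyBrowser:
    """Hypothetical stand-in for a browser tab: just a URL history stack."""

    def __init__(self, start_url: str) -> None:
        self.history: List[str] = [start_url]

    @property
    def url(self) -> str:
        return self.history[-1]

    def goto(self, url: str) -> None:
        self.history.append(url)

    def go_back(self) -> None:
        # Never pop the starting page.
        if len(self.history) > 1:
            self.history.pop()


def try_candidates(
    browser: ToyBrowser,
    candidates: List[str],
    is_correct: Callable[[str], bool],
) -> Optional[str]:
    """Visit each candidate; keep the first correct one, otherwise step back."""
    for url in candidates:
        browser.goto(url)
        if is_correct(browser.url):
            return url
        browser.go_back()  # read-only navigation, so stepping back undoes it
    return None


browser = ToyBrowser("gitlab/projects")
chosen = try_candidates(
    browser,
    ["gitlab/B", "gitlab/C", "gitlab/A"],
    lambda url: url.endswith("/A"),
)
print(chosen, browser.history)
```

This only makes sense when every candidate action is read-only. A state-changing action such as deleting a post is not undone by go_back().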

Regarding "deleting a post", this raises a new question for me: if I logged in to e.g. Reddit and deleted a post, how can the env recover from this? Maybe by reloading the docker file?

You are correct. After each round of evaluation, we reset the Docker environment to its original state.

shuyanzhou commented 3 months ago

Feel free to reopen if you have any questions in the future.