allenai / ScienceWorld

ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum.
https://sciworld.apps.allenai.org/
Apache License 2.0

[Bug] Mismatch between env.get_valid_action_object_combinations() and env.get_possible_actions() #75

Closed nuster1128 closed 1 month ago

nuster1128 commented 1 month ago

Thanks for your wonderful simulator!

However, during my experiments I found a mismatch:

When I use env.get_valid_action_object_combinations() to list all the possible operations, I get

[..., 'focus on orange', 'go to door to foundry', 'go to door to greenhouse', ...]

However, when I use env.get_possible_actions() to list all the possible actions, I get

[..., 'focus on OBJ', 'go OBJ', 'inventory', 'look around', ...]

The template 'go OBJ' does not match ['go to door to foundry', 'go to door to greenhouse', ...]: the 'to' is missing. As a result, the operations I generate (action + OBJ) may not match the available operations.

Moreover, the paper shows 'go to OBJ'.

Sincerely.

PeterAJansen commented 1 month ago

Hi @nuster1128 , thanks for your note, and for using ScienceWorld!

I think the issue you're reporting is that the high-level templates returned by the template API function get_possible_actions() don't always match the valid action list returned by the valid actions function get_valid_action_object_combinations(). Your example is go OBJ from the template function versus go *to* door to foundry from the valid action enumeration function.

Thanks for noting this -- one of the challenges is that under the hood, each action (like go) has a lot of aliases, so that the parser can recognize more ways of a user asking for the same action. For example, the aliases for go are:

new ActionExprOR(List("go", "go through", "walk through", "move through", "go to", "walk to", "move to", "go into", "move into")),

From: https://github.com/allenai/ScienceWorld/blob/c5e7187f745503751f323e80aa44373acd5451f8/simulator/src/main/scala/scienceworld/actions/ActionMoveToLocation.scala#L79

Because the typical API function from past environments returns only a single canonical template (e.g. go OBJ) rather than a list of aliases (e.g. ["go OBJ", "go through OBJ", "walk through OBJ", ...]), ScienceWorld kept that same API signature. A possible work-around would have been for the valid action enumeration to enumerate over all action verb aliases. I think I may have had that in an earlier version, but it makes the (already huge) list even larger, so I think I took it out during development. The enumeration now only covers the aliases for the object names (which can also be large).
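
To illustrate what a list of aliased templates would look like on the client side, here is a minimal sketch. The alias list is copied from the ActionExprOR line quoted above; expand_template is a hypothetical helper, not part of the ScienceWorld API, and it assumes the canonical verb is the leading word(s) of the template.

```python
from typing import List

# Alias list copied from the ActionExprOR line quoted above.
GO_ALIASES = [
    "go", "go through", "walk through", "move through",
    "go to", "walk to", "move to", "go into", "move into",
]


def expand_template(template: str, canonical_verb: str, aliases: List[str]) -> List[str]:
    """Return one template per alias by swapping out the canonical verb.

    Hypothetical helper for illustration only.
    """
    prefix = canonical_verb + " "
    if not template.startswith(prefix):
        # The template uses a different verb, so leave it unchanged.
        return [template]
    rest = template[len(prefix):]
    return [f"{alias} {rest}" for alias in aliases]


expanded = expand_template("go OBJ", "go", GO_ALIASES)
print(expanded)
```

Every alias multiplies the number of templates, which is the same growth problem described above for the valid action list.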

Thankfully there's a very simple solution for this -- I'd just implement a simple stop-word filter (e.g. 'to', 'through', 'into', etc.), and then do string matching on the filtered strings. Alternatively, since the templates should be the same across every ScienceWorld task, and the number of templates is small, you could just manually wrap your env.get_possible_actions() calls with a function that replaces (e.g.) go OBJ with go to OBJ. Either of those should be a quick fix to get you running. (An alternative would be for us to change this behavior in the next version, but I'm slightly hesitant to do this, since it could be a breaking change for existing agents.)
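
Both workarounds could look roughly like the sketch below. It is a rough illustration under stated assumptions, not code from the ScienceWorld repository: the names STOP_WORDS, normalize, match_valid_action, TEMPLATE_FIXES, and fix_templates are all hypothetical, and the two lists stand in for the outputs of env.get_valid_action_object_combinations() and env.get_possible_actions() shown earlier in this thread.

```python
from typing import Dict, List, Optional

# Workaround 1: stop-word filter plus string matching.
# The stop-word set is an assumption; extend it as needed.
STOP_WORDS = {"to", "through", "into"}


def normalize(action: str) -> str:
    """Lowercase an action string and drop stop words."""
    return " ".join(w for w in action.lower().split() if w not in STOP_WORDS)


def match_valid_action(generated: str, valid_actions: List[str]) -> Optional[str]:
    """Return the valid action equal to `generated` after filtering, else None."""
    lookup = {normalize(a): a for a in valid_actions}
    return lookup.get(normalize(generated))


# Workaround 2: rewrite known templates by hand.
# Only the 'go OBJ' entry comes from this thread; add others as you find them.
TEMPLATE_FIXES: Dict[str, str] = {"go OBJ": "go to OBJ"}


def fix_templates(templates: List[str]) -> List[str]:
    """Replace each template that has a manual fix; keep the rest unchanged."""
    return [TEMPLATE_FIXES.get(t, t) for t in templates]


# Stand-ins for the env outputs quoted earlier in the thread.
valid_actions = ["focus on orange", "go to door to foundry", "go to door to greenhouse"]
templates = ["focus on OBJ", "go OBJ", "inventory", "look around"]

generated = "go OBJ".replace("OBJ", "door to foundry")  # 'go door to foundry'
matched = match_valid_action(generated, valid_actions)
fixed = fix_templates(templates)
print(matched)
print(fixed)
```

With a live environment, the second workaround would be called as fix_templates(env.get_possible_actions()). One caveat with the filter: because stop words are dropped everywhere in the string, two valid actions that differ only by a stop word would collapse to the same key, and the lookup would keep only the last one.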