allenai / ScienceWorld

ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum.
https://sciworld.apps.allenai.org/
Apache License 2.0

Feature reward #6

PeterAJansen closed this 2 years ago

PeterAJansen commented 2 years ago

step() now returns the reward gained from that step rather than the absolute (cumulative) score.

Notes:

  1. The previous step() API was designed to mirror TextWorld. This change makes the API more closely mirror OpenAI Gym and Jericho.
  2. Both the absolute score and the per-step reward are still available in the 'info' dictionary returned from step().
  3. This is likely a breaking change for anything currently using ScienceWorld, so we should perhaps bump the version higher than just rc3?
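For callers that relied on the old behaviour, one possible migration path is a thin wrapper that accumulates per-step rewards back into an absolute score. The sketch below is illustrative only: `StubEnv` and `AbsoluteScoreShim` are hypothetical names, the four-element return shape `(observation, reward, done, info)` is assumed from the Gym convention mentioned above, and the `'reward'` and `'score'` key names in `info` are assumptions rather than confirmed ScienceWorld identifiers.

```python
from typing import Any, Dict, List, Tuple

# Assumed Gym-style return shape: (observation, reward, done, info).
StepResult = Tuple[str, int, bool, Dict[str, Any]]


class StubEnv:
    """Hypothetical stand-in environment; NOT the real ScienceWorld class."""

    def __init__(self, rewards: List[int]) -> None:
        self._rewards = list(rewards)
        self._score = 0
        self._t = 0

    def step(self, action: str) -> StepResult:
        reward = self._rewards[self._t]  # reward gained from this step only
        self._t += 1
        self._score += reward            # absolute score so far
        done = self._t == len(self._rewards)
        # Key names below are assumed for illustration.
        info = {"reward": reward, "score": self._score}
        return f"observation after '{action}'", reward, done, info


class AbsoluteScoreShim:
    """Wraps an env so that step() returns the absolute score again."""

    def __init__(self, env: Any) -> None:
        self._env = env
        self._score = 0

    def step(self, action: str) -> StepResult:
        obs, reward, done, info = self._env.step(action)
        self._score += reward  # rebuild the running score from step rewards
        return obs, self._score, done, info


if __name__ == "__main__":
    env = AbsoluteScoreShim(StubEnv([0, 58, 9]))
    for action in ["take picture", "focus on picture", "open kitchen door"]:
        obs, score, done, info = env.step(action)
        print(action, "->", score, done)
```

If the 'info' dictionary does carry the absolute score, reading it from there would be simpler than re-accumulating; the shim is only useful where existing code unpacks the second return value as a score.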

Example output from running the human.py example:

Gold Path:['open door to kitchen', 'go to kitchen', 'look around', 'focus on counter', 'move counter to red box']
Task Name: task-3-find-non-living-thing
Variation: 0 / 300
Task Description: Your task is to find a(n) non-living thing. First, focus on the thing. Then, move it to the red box in the kitchen.

This room is called the hallway. In it, you see:
    the agent
    a picture
    a substance called air
You also see:
    A door to the workshop (that is closed)
    A door to the art studio (that is closed)
    A door to the kitchen (that is closed)
    A door to the living room (that is closed)
    A door to the green house (that is closed)
    A door to the bedroom (that is closed)
Reward: 0
Score: 0
isCompleted: False
'help' lists valid action templates, 'objects' lists valid objects, 'valid' lists valid action-object combinations (long!).
'goals' lists progress on subgoals.
type 'exit' to quit.

take picture

You move the picture to the inventory.
Reward: 0
Score: 0
isCompleted: False
'help' lists valid action templates, 'objects' lists valid objects, 'valid' lists valid action-object combinations (long!).
'goals' lists progress on subgoals.
type 'exit' to quit.

focus on picture

You focus on the picture.
Reward: 58
Score: 58
isCompleted: False
'help' lists valid action templates, 'objects' lists valid objects, 'valid' lists valid action-object combinations (long!).
'goals' lists progress on subgoals.
type 'exit' to quit.

open kitchen door

The door is now open.
Reward: 9
Score: 67
isCompleted: False
'help' lists valid action templates, 'objects' lists valid objects, 'valid' lists valid action-object combinations (long!).
'goals' lists progress on subgoals.
type 'exit' to quit.

go kitchen

You move to the kitchen.
Reward: 16
Score: 83
isCompleted: False
'help' lists valid action templates, 'objects' lists valid objects, 'valid' lists valid action-object combinations (long!).
'goals' lists progress on subgoals.
type 'exit' to quit.

look

This room is called the kitchen. In it, you see:
    a stopwatch, which is deactivated.
    a freezer. The freezer door is closed.
    a table. On the table is: a glass cup (containing nothing).
    a substance called air
    a cupboard. The cupboard door is closed.
    a sink, which is turned off. In the sink is: nothing.
    the agent
    a painting
    a fridge. The fridge door is closed.
    a chair. On the chair is: nothing.
    a glass jar (containing a substance called sodium chloride)
    a substance called soap
    a stove, which is turned off. On the stove is: nothing.
    a red box (containing nothing)
    a oven, which is turned off. The oven door is closed.
    a thermometer, currently reading a temperature of 10 degrees celsius
    a lighter
    a counter. On the counter is: a drawer, a bowl (containing an orange, a red apple, a banana, a potato).
You also see:
    A door to the bathroom (that is closed)
    A door to the outside (that is closed)
    A door to the hallway (that is open)
Reward: 0
Score: 83
isCompleted: False
'help' lists valid action templates, 'objects' lists valid objects, 'valid' lists valid action-object combinations (long!).
'goals' lists progress on subgoals.
type 'exit' to quit.

move picture in inventory to red box

You move the picture to the red box.
Reward: 17
Score: 100
isCompleted: True
'help' lists valid action templates, 'objects' lists valid objects, 'valid' lists valid action-object combinations (long!).
'goals' lists progress on subgoals.
type 'exit' to quit.
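As a quick sanity check on the transcript above, the per-step Reward values should sum to the running Score at every step. The snippet below is a small illustrative check using only the numbers printed in the transcript; it does not call ScienceWorld itself.

```python
from itertools import accumulate

# Reward values as printed in the transcript, starting with the
# initial observation (before any action was taken).
rewards = [0, 0, 58, 9, 16, 0, 17]

# The running total of per-step rewards is the absolute score.
scores = list(accumulate(rewards))
print(scores)  # [0, 0, 58, 67, 83, 83, 100]
```

The running totals match the Score values in the transcript, ending at 100 when isCompleted becomes True.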