allenai / ScienceWorld

ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum.
https://sciworld.apps.allenai.org/
Apache License 2.0

Tasks for measuring the melting point do not report the measurement unit #20

Closed manuelciosici closed 1 year ago

manuelciosici commented 2 years ago

Here's a typical task to measure the melting point of a material:

Your task is to measure the melting point of lead, which is located around the kitchen. First, focus on the thermometer. Next, focus on the lead. If the melting point of lead is above 150.0 degrees, focus on the yellow box. If the melting point of lead is below 150.0 degrees, focus on the purple box. The boxes are located around the kitchen.

Lead's melting point is 621.5°F or 327.5°C, so the measurement unit does not matter for this particular task instance: both the Fahrenheit and Celsius values are above the 150 threshold. I have not checked all the materials in ScienceWorld, but I expect the difference between Fahrenheit and Celsius to matter for some task variations.
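To make the ambiguity concrete, here is a minimal sketch of a check for whether a threshold question gives different answers depending on the unit. The function names are hypothetical and not part of ScienceWorld, and the second material is an invented example rather than one taken from the environment.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def unit_matters(melting_point_c: float, threshold: float) -> bool:
    """Return True when "is the melting point above the threshold?" has a
    different answer in Celsius than in Fahrenheit."""
    above_in_celsius = melting_point_c > threshold
    above_in_fahrenheit = celsius_to_fahrenheit(melting_point_c) > threshold
    return above_in_celsius != above_in_fahrenheit


# Lead (327.5 C / 621.5 F) against the 150.0 threshold from the task text:
# both readings are above the threshold, so the unit is irrelevant here.
print(unit_matters(327.5, 150.0))  # False

# Hypothetical material melting at 100 C (212 F): below 150 in Celsius,
# above 150 in Fahrenheit, so the unstated unit changes the correct box.
print(unit_matters(100.0, 150.0))  # True
```

Any material whose melting point lies between the threshold read as Fahrenheit and the threshold read as Celsius would be affected in the same way.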

ScienceWorld should specify the units of measurement both in task descriptions and when using a thermometer. It already takes a stance on reporting the unit as Celsius when using a thermometer (e.g., the action "use thermometer in inventory on apple juice" returns the observation "the thermometer measures a temperature of 8 degrees celsius").
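As an illustration of why an explicit unit helps, an agent can read both the value and the unit straight out of the thermometer observation. This is only a sketch: the pattern is inferred from the single observation quoted above, `parse_thermometer_reading` is a hypothetical helper, and other observation formats may exist.

```python
import re
from typing import Optional, Tuple

# Pattern inferred from one example observation; assumed, not confirmed
# against the ScienceWorld source.
READING_PATTERN = re.compile(r"temperature of (-?\d+(?:\.\d+)?) degrees (\w+)")


def parse_thermometer_reading(observation: str) -> Optional[Tuple[float, str]]:
    """Extract (value, unit) from a thermometer observation, or None if the
    observation does not match the assumed format."""
    match = READING_PATTERN.search(observation)
    if match is None:
        return None
    return float(match.group(1)), match.group(2).lower()


obs = "the thermometer measures a temperature of 8 degrees celsius"
print(parse_thermometer_reading(obs))  # (8.0, 'celsius')
```

The task description offers no equivalent: "above 150.0 degrees" carries no unit to parse, so the same comparison cannot be made safely from the task text alone.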

Without specifying the unit of measure in task descriptions, agents can't accomplish the vision outlined in section 5.3 of the ScienceWorld paper, which states that some tasks can be solved by knowing facts (e.g., that a metal fork is an electrical conductor or, in this case, lead's melting point), while other task variations require experimentation.

I can make a PR for measurement units once this is confirmed as an issue.

manuelciosici commented 2 years ago

@rajammanabrolu

MarcCote commented 2 years ago

@manuelciosici thank you for reporting this. ScienceWorld adheres to SI units, and we should make that clear in the task description and in the observation returned by the environment.