lupantech / MathVista

MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts
https://mathvista.github.io/
Creative Commons Attribution Share Alike 4.0 International

Inefficient file write operations - writing entire results dictionary to output path in loops #17

Closed mattmazzola closed 2 months ago

mattmazzola commented 9 months ago

This is more of an optimization than a bug or an issue with evaluation, but I think it is worth noting in case someone considers it worth addressing.

generate_response.py and extract_answer.py use an inefficient pattern: they save the entire results dictionary to the output .json file repeatedly inside their loops.

generate_response: every iteration
extract_answer: every n iterations (default of 10)

References:

https://github.com/lupantech/MathVista/blob/82f68d09b4cbffe9d0dfd7542c599810e30c9a99/evaluation/generate_response.py#L212-L214

    for _, pid in enumerate(tqdm(test_pids)):
        ...
        save_json(results, output_file)

https://github.com/lupantech/MathVista/blob/82f68d09b4cbffe9d0dfd7542c599810e30c9a99/evaluation/extract_answer.py#L146-L149

Given that testmini has 1000 items and test has 5141, each with a non-trivial amount of text, this cost may be significant. (It would be nice to have an estimate of the full file size, although I haven't generated the entire set yet.)
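
As a rough illustration of how the cost grows, the sketch below counts the total bytes written when the whole dictionary is re-dumped at different intervals. The per-item size (`ASSUMED_BYTES_PER_ITEM`) is a made-up placeholder, not a measurement, and the function name is hypothetical; the model also assumes every record is about the same size.

```python
def total_bytes_written(n_items: int, bytes_per_item: int, save_every: int = 1) -> int:
    """Total bytes written if the full dict is re-dumped every `save_every` items.

    Assumes each record serializes to roughly `bytes_per_item` bytes, so a dump
    after i items costs about i * bytes_per_item.
    """
    total = 0
    for i in range(1, n_items + 1):
        # Dump on every save_every-th item and once more at the very end.
        if i % save_every == 0 or i == n_items:
            total += i * bytes_per_item
    return total


# Hypothetical average record size; replace with a measured value.
ASSUMED_BYTES_PER_ITEM = 8_000

for name, n in [("testmini", 1000), ("test", 5141)]:
    every_item = total_bytes_written(n, ASSUMED_BYTES_PER_ITEM, save_every=1)
    every_ten = total_bytes_written(n, ASSUMED_BYTES_PER_ITEM, save_every=10)
    single_write = n * ASSUMED_BYTES_PER_ITEM
    print(
        f"{name}: every item = {every_item / 1e9:.2f} GB, "
        f"every 10 items = {every_ten / 1e9:.2f} GB, "
        f"single write = {single_write / 1e6:.1f} MB"
    )
```

Under this model, saving on every iteration writes n(n + 1)/2 record-sizes in total, that is, (n + 1)/2 times as much as a single final write.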

Each file currently has this format:

{
    "1": {
        "question": "When a spring does work on an object, we cannot find the work by simply multiplying the spring force by the object's displacement. The reason is that there is no one value for the force-it changes. However, we can split the displacement up into an infinite number of tiny parts and then approximate the force in each as being constant. Integration sums the work done in all those parts. Here we use the generic result of the integration.\r\n\r\nIn Figure, a cumin canister of mass $m=0.40 \\mathrm{~kg}$ slides across a horizontal frictionless counter with speed $v=0.50 \\mathrm{~m} / \\mathrm{s}$. It then runs into and compresses a spring of spring constant $k=750 \\mathrm{~N} / \\mathrm{m}$. When the canister is momentarily stopped by the spring, by what distance $d$ is the spring compressed?",
        "image": "images/1.jpg",
        "choices": null,
        "unit": null,
        "precision": 1,
        "answer": "1.2",
        "question_type": "free_form",
        "answer_type": "float",
        "pid": "1",
        "metadata": {
            "split": "testmini",
            "language": "english",
            "img_width": 1514,
            "img_height": 720,
            "source": "SciBench",
            "category": "math-targeted-vqa",
            "task": "textbook question answering",
            "context": "scientific figure",
            "grade": "college",
            "skills": [
                "scientific reasoning"
            ]
        },
        "caption": "The image shows a free body diagram of a spring-mass system. The system consists of a spring, a mass, and a surface. The spring is attached to the mass and the surface. The mass is displaced to the right of its equilibrium position. The spring force is pointing to the left, opposite the displacement of the mass. The text in the image says \"The spring force does negative work, decreasing speed and kinetic energy.\"\n\nHere is a more detailed description of the objects and relationships in the image:\n\n* **Spring:** The spring is a long, flexible object that can be stretched or compressed. It exerts a force that is proportional to its displacement from its equilibrium position.\n* **Mass:** The mass is a small object that has a certain amount of inertia. Inertia is the tendency of an object to resist changes in its motion.\n* **Surface:** The surface is a flat object that the mass is resting on. The surface is frictionless, which means that there is no friction between the mass and the surface.\n* **Spring force:** The spring force is the force that the spring exerts on the mass. The spring force is directed towards the spring's equilibrium position.\n* **Displacement:** The displacement of the mass is the distance between the mass's current position and its equilibrium position.\n* **Negative work:** Negative work is work that decreases the energy of a system. In this case, the spring force does negative work on the mass, which decreases the mass's kinetic energy.\n* **Decreasing speed and kinetic energy:** The spring force causes the mass to slow down, which decreases its kinetic energy. Kinetic energy is the energy of motion.",
        "ocr": "[([161, 39], 'The spring force does'), ([158, 104], 'negative work; decreasing'), ([154, 197], 'speed and kinetic energy:'), ([316, 304], 'k'), ([812, 378], 'Frictionless'), ([1186, 378], 'mL'), ([473, 569], 'd'), ([631, 643], 'First touch'), ([240, 638], 'Stop')]",
        "query": "Question: How much money does Ruth need to buy a baking dish, a casserole dish, and an ice cream scoop? (Unit: $)\nImage description: The image shows a table with a variety of items on it, including a baking dish, ice cream scoop, casserole dish, and rolling pin. The text in the image says:\n\n```\nbaking dish\n$4.00\nice cream scoop\n$6.00\ncasserole dish\n$3.00\nrolling pin\n$4.00\n```\nImage detected text: [([5, 3], 'baking dish'), ([177, 5], '$4.00'), ([7, 41], 'ice cream scoop'), ([177, 37], '$6.00'), ([9, 69], 'casserole dish'), ([177, 69], '$3.00'), ([5, 98], 'rolling pin'), ([177, 101], '$4.00')]\nPython code: baking_dish_price = 4.00\ncasserole_dish_price = 3.00\nice_cream_scoop_price = 6.00\n\nans = baking_dish_price + casserole_dish_price + ice_cream_scoop_price\nprint(ans)\n\nQuestion: What is the largest city in the nation where this plane is headquartered?\nChoices:\n(A) hong kong\n(B) osaka\n(C) shanghai\n(D) tokyo\nImage description: The image shows a large passenger jet parked on a tarmac at an airport. The jet is white with red trim and has a red tail. It is sitting on top of a tarmac next to a building. The jet is being loaded with passengers and cargo. The text on the image says \"Japan. Endless Discovery\".\nPython code: def largest_city(caption, choices):\n    countries_largest_cities = {\n        'Japan': 'tokyo',\n        'China': 'shanghai'\n    }\n\n    if \"Japan\" in caption:\n        country = 'Japan'\n    elif \"China\" in caption:\n        country = 'China'\n\n    for choice in choices:\n        if choice == countries_largest_cities[country]:\n            return choice\n    return \"\"\n\nchoices = ['hong kong', 'osaka', 'shanghai', 'tokyo']\ncaption = \"The image shows a large passenger jet parked on a tarmac at an airport. The jet is white with red trim and has a red tail. It is sitting on top of a tarmac next to a building. The jet is being loaded with passengers and cargo. The text on the image says 'Japan. 
Endless Discovery'.\"\n\nprint(largest_city(caption, choices))\n\nQuestion: When a spring does work on an object, we cannot find the work by simply multiplying the spring force by the object's displacement. The reason is that there is no one value for the force-it changes. However, we can split the displacement up into an infinite number of tiny parts and then approximate the force in each as being constant. Integration sums the work done in all those parts. Here we use the generic result of the integration.\r\n\r\nIn Figure, a cumin canister of mass $m=0.40 \\mathrm{~kg}$ slides across a horizontal frictionless counter with speed $v=0.50 \\mathrm{~m} / \\mathrm{s}$. It then runs into and compresses a spring of spring constant $k=750 \\mathrm{~N} / \\mathrm{m}$. When the canister is momentarily stopped by the spring, by what distance $d$ is the spring compressed?\nImage description: The image shows a free body diagram of a spring-mass system. The system consists of a spring, a mass, and a surface. The spring is attached to the mass and the surface. The mass is displaced to the right of its equilibrium position. The spring force is pointing to the left, opposite the displacement of the mass. The text in the image says \"The spring force does negative work, decreasing speed and kinetic energy.\"\n\nHere is a more detailed description of the objects and relationships in the image:\n\n* **Spring:** The spring is a long, flexible object that can be stretched or compressed. It exerts a force that is proportional to its displacement from its equilibrium position.\n* **Mass:** The mass is a small object that has a certain amount of inertia. Inertia is the tendency of an object to resist changes in its motion.\n* **Surface:** The surface is a flat object that the mass is resting on. The surface is frictionless, which means that there is no friction between the mass and the surface.\n* **Spring force:** The spring force is the force that the spring exerts on the mass. 
The spring force is directed towards the spring's equilibrium position.\n* **Displacement:** The displacement of the mass is the distance between the mass's current position and its equilibrium position.\n* **Negative work:** Negative work is work that decreases the energy of a system. In this case, the spring force does negative work on the mass, which decreases the mass's kinetic energy.\n* **Decreasing speed and kinetic energy:** The spring force causes the mass to slow down, which decreases its kinetic energy. Kinetic energy is the energy of motion.\nImage detected text: [([161, 39], 'The spring force does'), ([158, 104], 'negative work; decreasing'), ([154, 197], 'speed and kinetic energy:'), ([316, 304], 'k'), ([812, 378], 'Frictionless'), ([1186, 378], 'mL'), ([473, 569], 'd'), ([631, 643], 'First touch'), ([240, 638], 'Stop')]\nHint: Please generate a python code to solve the problem\nPython code:",
        "response": "import math\n\n# Given values\nm = 0.40  # mass in kg\nv = 0.50  # speed in m/s\nk = 750  # spring constant in N/m\n\n# The kinetic energy of the canister is converted into potential energy in the spring\n# The equation for kinetic energy is KE = 1/2 * m * v^2\n# The equation for potential energy in a spring is PE = 1/2 * k * d^2\n# Setting these equal to each other and solving for d gives:\n\nd = math.sqrt((m * v**2) / k)\nprint(d)",
        "execution": "0.011547005383792516",
        "error": "None",
        "extraction": "0.40"
    },
    "2": {
    }
}

Solution 1:

Apply the save_every pattern from extract_answer to generate_response: https://github.com/lupantech/MathVista/blob/82f68d09b4cbffe9d0dfd7542c599810e30c9a99/evaluation/extract_answer.py#L98
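
A minimal sketch of what this could look like, assuming a `save_json` helper similar to the one the scripts already call. The `run_with_periodic_saves` function, its `process` callback, and the stand-in `save_json` body are hypothetical and only for illustration; the repository's actual code may differ.

```python
import json
import os
import tempfile


def save_json(data, path):
    # Stand-in for the repository's save_json helper (assumed behavior).
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=4)


def run_with_periodic_saves(pids, process, output_file, save_every=10):
    """Process each pid, writing the full results dict only every `save_every` items."""
    results = {}
    for i, pid in enumerate(pids):
        results[pid] = process(pid)
        # Save on every save_every-th item, and always after the last one
        # so the final file is complete.
        if (i + 1) % save_every == 0 or i == len(pids) - 1:
            save_json(results, output_file)
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out = os.path.join(tmp, "output.json")
        pids = [str(i) for i in range(1, 26)]
        run_with_periodic_saves(pids, lambda pid: {"response": f"answer for {pid}"}, out)
        with open(out, encoding="utf-8") as f:
            print(len(json.load(f)), "results saved")
```

The trade-off is that a crash can lose up to `save_every - 1` unsaved results, which matters when each result comes from a paid or slow model call.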

Solution 2:

Given that the only purpose of the write operation is to add a new result or extraction to the file, a possible solution is to use .jsonl output and append only the new line. This avoids rewriting existing records, but it requires more work since it also changes how the files are read in subsequent steps.
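
A hedged sketch of the .jsonl idea: each result is appended as one JSON object per line, and a reader rebuilds the pid-keyed dictionary. The function names `append_jsonl` and `load_jsonl` are hypothetical, and the sketch assumes every record carries a "pid" field as in the format shown above.

```python
import json
import os
import tempfile


def append_jsonl(record, path):
    """Append a single record as one line; existing lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as f:
        # json.dumps escapes newlines inside strings, so each record stays on one line.
        f.write(json.dumps(record) + "\n")


def load_jsonl(path):
    """Rebuild the pid-keyed dict; a later line for the same pid replaces an earlier one."""
    results = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                record = json.loads(line)
                results[record["pid"]] = record
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "output.jsonl")
        append_jsonl({"pid": "1", "response": "print(1.2)"}, path)
        append_jsonl({"pid": "2", "response": "print(3)"}, path)
        print(sorted(load_jsonl(path)))
```

With this layout the cost of each save is proportional to one record rather than to everything generated so far, and a rerun can skip any pid already present in the file.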

Also, given that extract_answer updates an existing file in place instead of writing a new one, it could be changed to write the extractions, keyed by pid, to a separate file.
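
One way the separate-file idea might be consumed downstream, as a sketch only: the responses file stays untouched, the extraction step writes a small pid-to-extraction mapping, and a later step joins the two. The function name `merge_extractions` is hypothetical.

```python
def merge_extractions(responses, extractions):
    """Join a pid-keyed responses dict with a pid -> extraction mapping.

    Records without an extraction get None, so missing work is easy to spot.
    The input dicts are not modified.
    """
    merged = {}
    for pid, record in responses.items():
        merged[pid] = {**record, "extraction": extractions.get(pid)}
    return merged


# Illustrative data only; values are made up.
responses = {
    "1": {"pid": "1", "response": "print(1.2)"},
    "2": {"pid": "2", "response": "print(3)"},
}
extractions = {"1": "1.2"}

merged = merge_extractions(responses, extractions)
print(merged["1"]["extraction"], merged["2"]["extraction"])
```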

lupantech commented 2 months ago

Thank you, @mattmazzola! We’ve approved your pull request, which resolves the issue: PR #22.