scicode-bench / SciCode

A benchmark that challenges language models to code solutions for scientific problems
Apache License 2.0

Inquiry on Skipping Specific Problems in SciCode Benchmark #1

Closed by HariSeldon0 2 months ago

HariSeldon0 commented 3 months ago

Thank you for your excellent work on "SciCode: A Research Coding Benchmark Curated by Scientists".

I am currently reproducing your work using the provided code and noticed that some problems are skipped in the following section:

        ... ...
        for i in range(steps):
            if (prob_id == "13" and i == 5) or (prob_id == "62" and i == 0)\
                    or (prob_id == "76" and i == 2):
                continue
            gcode.generate_response_with_steps(problem, i + 1, steps, model)
        ... ...

Specifically, subproblems 13.5, 62.0, and 76.2 (numbered here by the zero-based loop index i) are being skipped. Could you please explain the reason for this?

Your clarification would be greatly appreciated.

mtian8 commented 2 months ago

The skipped subproblems are given directly as part of the question design, so the model is not asked to generate them. They can be found here: https://github.com/scicode-bench/SciCode/tree/main/eval/data
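
For readers reproducing the benchmark, the skip condition quoted in the question can be read as a lookup against a small set of pre-supplied steps. The following is only a minimal sketch of that idea, not the repository's code: the names PROVIDED_STEPS and steps_to_generate are hypothetical, and the step counts in the demo are assumed for illustration.

```python
# Hypothetical sketch; these names are illustrative and do not come from
# the SciCode repository.

# (problem id, zero-based step index) pairs taken from the condition quoted
# in the question. Their solutions ship with the benchmark data.
PROVIDED_STEPS = {("13", 5), ("62", 0), ("76", 2)}


def steps_to_generate(prob_id: str, steps: int) -> list[int]:
    """Return the one-based step numbers the model is still asked to solve."""
    return [
        i + 1  # mirrors the i + 1 passed to the generation call in the question
        for i in range(steps)
        if (prob_id, i) not in PROVIDED_STEPS
    ]


if __name__ == "__main__":
    # Step counts below are assumed purely for demonstration.
    print(steps_to_generate("62", 3))  # first step is provided, so it is skipped
    print(steps_to_generate("1", 2))   # no provided steps, nothing is skipped
```

Keeping the provided steps in one set makes it easier to see at a glance which subproblems are excluded from generation than a chained boolean condition does.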