Inquiry on Skipping Specific Problems in SciCode Benchmark

scicode-bench / SciCode

A benchmark that challenges language models to code solutions for scientific problems

Apache License 2.0

Thank you for your excellent work on "SciCode: A Research Coding Benchmark Curated by Scientists".

I am reproducing your results with the provided code and noticed that some problem steps are skipped in the following loop:

        ... ...
        for i in range(steps):
            if (prob_id == "13" and i == 5) or (prob_id == "62" and i == 0)\
                    or (prob_id == "76" and i == 2):
                continue
            gcode.generate_response_with_steps(problem, i + 1, steps, model)
        ... ...

Specifically, steps 13.5, 62.0, and 76.2 (problem ID followed by the zero-based loop index `i`) are skipped. Could you please explain why?
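
To make the question concrete, here is a minimal sketch of how I read the skip condition. `SKIPPED` and `steps_to_generate` are hypothetical names of my own and are not part of the SciCode repository, and the step counts in the usage lines are made up for illustration. Because the loop calls `generate_response_with_steps` with `i + 1`, a skipped index `i` corresponds to step number `i + 1`.

```python
from __future__ import annotations

# (prob_id, zero-based index i) pairs copied from the condition in the loop above.
# The name SKIPPED is hypothetical; the repository hard-codes these in an `if`.
SKIPPED = {("13", 5), ("62", 0), ("76", 2)}


def steps_to_generate(prob_id: str, steps: int) -> list[int]:
    """Return the 1-based step numbers that would still be generated.

    Mirrors the loop: iterate i over range(steps), skip the hard-coded
    (prob_id, i) pairs, and report i + 1 for everything else.
    """
    return [i + 1 for i in range(steps) if (prob_id, i) not in SKIPPED]


if __name__ == "__main__":
    # Step counts below are assumptions chosen only to exercise the skip logic.
    print(steps_to_generate("13", 6))
    print(steps_to_generate("62", 3))
    print(steps_to_generate("76", 3))
```

If this reading is right, the steps that never get generated are the 6th step of problem 13, the 1st step of problem 62, and the 3rd step of problem 76.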

Your clarification would be greatly appreciated.

Inquiry on Skipping Specific Problems in SciCode Benchmark #1