scicode-bench / SciCode

A benchmark that challenges language models to code solutions for scientific problems
Apache License 2.0

One click run #6

Open ofirpress opened 2 months ago

ofirpress commented 2 months ago

We should have a simple one-line command that runs the benchmark and computes the results end-to-end.

We need to solve the dependency issue (#2) first for this to work.
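
A rough sketch of what such a wrapper could look like, just to make the idea concrete. Everything here is an assumption for illustration: `run_pipeline`, `main`, and the three stage names are hypothetical, and the stage commands are placeholders that only print a message. They are not the repository's actual scripts or flags.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple

Stage = Tuple[str, Sequence[str]]  # (stage name, command to execute)


def run_pipeline(stages: Sequence[Stage]) -> List[str]:
    """Run each stage in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, command in stages:
        print(f"[{name}] running: {' '.join(command)}")
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"[{name}] failed with exit code {result.returncode}",
                  file=sys.stderr)
            break
        completed.append(name)
    return completed


def main() -> int:
    # Placeholder commands (hypothetical): swap in the real generation,
    # evaluation, and reporting steps once they exist.
    stages = [
        ("generate", [sys.executable, "-c", "print('generate model outputs')"]),
        ("evaluate", [sys.executable, "-c", "print('run the test cases')"]),
        ("report", [sys.executable, "-c", "print('aggregate the results')"]),
    ]
    completed = run_pipeline(stages)
    # Exit code 0 only if every stage finished.
    return 0 if len(completed) == len(stages) else 1
```

Saved as a script that ends with `raise SystemExit(main())`, this would give a single command whose exit code tells CI whether the whole run succeeded. Stopping at the first failed stage avoids reporting results computed from a partial run.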