nuprl / MultiPL-E

A multi-programming language benchmark for LLMs
https://nuprl.github.io/MultiPL-E/

Local datasets are JSON arrays, not JSON-L #81

Closed by JohnGouwar 1 year ago

JohnGouwar commented 1 year ago

In multipl_e/completions.py:from_local_dataset, the JSON file was read as if it were a JSON-L file (i.e., one object per line). However, dataset_builder/prepare_prompts_json.py, and therefore dataset_builder/all_prepare_prompts.py, produces a JSON array in which each line holds just a single field of an object, so parsing the file line by line does not work. This change fixes the interface mismatch.
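
To illustrate the mismatch, here is a minimal sketch of the two reading strategies. It is not the code from multipl_e/completions.py: the helper names read_jsonl and read_json_array, the record fields, and the use of indent=2 to mimic a pretty-printed array are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path):
    """Parse one JSON object per non-empty line (JSON-L). Hypothetical helper."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def read_json_array(path):
    """Parse the whole file as a single JSON array. Hypothetical helper."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError(f"expected a JSON array, got {type(data).__name__}")
    return data


# Made-up records; the real dataset fields may differ.
problems = [
    {"name": "example_0", "prompt": "def f():"},
    {"name": "example_1", "prompt": "def g():"},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "prompts.json"
    # Assumption: the array is pretty-printed, so each line holds one field.
    path.write_text(json.dumps(problems, indent=2), encoding="utf-8")

    # Reading the file as one JSON document recovers the records.
    loaded = read_json_array(path)

    # Reading it line by line fails, because a single line such as "["
    # is not a complete JSON value.
    try:
        read_jsonl(path)
        jsonl_ok = True
    except json.JSONDecodeError:
        jsonl_ok = False
```

In this sketch the line-by-line reader raises json.JSONDecodeError on the pretty-printed file, while the whole-file reader returns the original list of records.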