[FR] include "config" data in generations_only

Feature proposal/request: include the "config" args in the saved output when running in generation_only mode.

Motivation

I run generations on several remote systems and then run the evaluations locally on my workstation. Currently I have to keep track of the exact generation parameters through the filename alone, and then run a really long command that repeats all the same parameters for the evals.

pros

never mix up runs
don't use filenames for data
could be much easier to run eval only (just have to give it a .json that already includes the task, limit, model, etc)

cons

likely breaks some existing downstream scripts
tiny bit of redundant data

relevant code

It seems that in generation_only mode only the generations are saved (save_generations), so the args are lost: https://github.com/bigcode-project/bigcode-evaluation-harness/blob/1b0147c50f406ff66ac4f806230479f31d19c7e6/main.py#L400-L408

The generations.json is just a list of lists of strings, but it could easily hold the same config as the eval_results.json. You would also need a bit of code to read these args back in eval-only mode. This would also help when there is a crash during the eval run: in that case you have generations but no eval results saved, so the config is not recorded anywhere.
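To make the idea concrete, here is a rough sketch of what writing and reading such a file could look like. This is not the harness's actual code: the function names `save_generations_with_config` and `load_generations` and the top-level keys `"config"` and `"generations"` are hypothetical and only meant as a starting point for discussion. The loader still accepts the current bare-list format, so existing generations files would keep working.

```python
import json
from pathlib import Path


def save_generations_with_config(path, generations, config):
    """Write generations and the run config into a single JSON file.

    Hypothetical helper; the key names are assumptions, not harness API.
    """
    payload = {"config": config, "generations": generations}
    Path(path).write_text(json.dumps(payload, indent=2))


def load_generations(path):
    """Return (generations, config); config is None for legacy files."""
    data = json.loads(Path(path).read_text())
    if isinstance(data, list):
        # Legacy format: a bare list of lists of strings, no config attached.
        return data, None
    return data["generations"], data.get("config")
```

In eval-only mode, any value found in the loaded config could then serve as a default for arguments that were not given on the command line. Downstream scripts that read the file as a plain list would still need updating, which is the compatibility cost listed under cons.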

Alternatively, there could be a config.json file that keeps track of these args in generation_only mode (or an empty eval_results). But that still leaves you with multiple files instead of having everything contained in one.

This is a bit of an RFC before I try to implement it in a PR myself. I'd especially like input on how to make it compatible with existing formats.

bigcode-project / bigcode-evaluation-harness

[FR] include "config" data in generations_only #226
