Closed Muennighoff closed 1 year ago
That makes sense, how do you suggest including it? Add an arg to the task class or output the name of the metric after calling it in postprocess_results
for example?
Yeah I would return the metric name that we provide to evaluate.load
, maybe sth like:
code_metric = load("code_eval")
results, _ = code_metric.compute(
references=references,
predictions=generations,
)
return {
"code_eval": results,
}
would be better to also have the metric imo