bigcode-project / bigcode-evaluation-harness

A framework for the evaluation of autoregressive code generation language models.
Apache License 2.0

Separate generation and evaluation + add CI #15

Closed. loubnabnl closed this 1 year ago

loubnabnl commented 1 year ago

This PR adds options to run the evaluation harness in generation-only mode or in evaluation-only mode (on previously computed generations). It also adds a CI workflow and tests for HumanEval and MBPP.
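To illustrate the split, here is a minimal sketch of the two phases run independently, with a JSON file as the hand-off between them. It is not the harness's actual code: `generate`, `save_generations`, `load_generations`, `passes`, and `evaluate` are hypothetical names, the "model" is a stub that returns a fixed completion, and the score is a plain per-problem pass fraction rather than the harness's metric.

```python
import json
import os
import tempfile


def generate(prompts, n_samples=2):
    """Stub for model sampling (assumption): n_samples candidates per prompt."""
    return [[p + "    return a + b\n" for _ in range(n_samples)] for p in prompts]


def save_generations(generations, path):
    """Generation-only mode ends here: write candidates to disk."""
    with open(path, "w") as f:
        json.dump(generations, f)


def load_generations(path):
    """Evaluation-only mode starts here: read previously computed candidates."""
    with open(path) as f:
        return json.load(f)


def passes(candidate, test):
    """Run a candidate plus its test in a fresh namespace; True if nothing raises.

    exec runs arbitrary code, so real model output should only be
    executed inside a sandbox.
    """
    namespace = {}
    try:
        exec(candidate + "\n" + test, namespace)
        return True
    except Exception:
        return False


def evaluate(generations, tests):
    """Mean over problems of the fraction of candidates that pass the test."""
    scores = []
    for candidates, test in zip(generations, tests):
        n_pass = sum(passes(c, test) for c in candidates)
        scores.append(n_pass / len(candidates))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "generations.json")

    # Phase 1: generation only.
    save_generations(generate(["def add(a, b):\n"]), path)

    # Phase 2: evaluation only, possibly in a separate run or on another machine.
    print(evaluate(load_generations(path), ["assert add(1, 2) == 3"]))
```

Keeping the hand-off as a plain file means the expensive generation step can run once on a GPU machine, while evaluation can be repeated later without reloading the model.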

cc @ocramz