How to evaluate on Human-eval

salesforce / CodeGen

CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.

Apache License 2.0

4.94k stars, 381 forks

How to evaluate on Human-eval #51

Closed. Lifeasarain closed this issue 1 year ago

Lifeasarain commented 1 year ago

I would like to evaluate the CodeGen models on the HumanEval dataset, but I don't know how to generate 200 samples for each problem in order to calculate pass@k. Could you provide any documentation on this?
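
For context, pass@k is usually computed with the unbiased estimator 1 - C(n - c, k) / C(n, k), where n is the number of samples drawn per problem (200 here) and c is how many of them pass the unit tests. The sketch below shows that estimator and a sampling loop that collects n completions per problem as records with `task_id` and `completion` fields. It is only an illustration, not the CodeGen authors' evaluation script: `collect_samples`, `write_jsonl`, and the `generate` callable are hypothetical names, and `generate` stands in for whatever actually samples from the model.

```python
import json
from math import comb
from typing import Callable, Dict, Iterable, List


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: 1 - C(n - c, k) / C(n, k).

    n: samples generated for one problem, c: samples that passed, k: budget.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k samples include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def collect_samples(
    problems: Dict[str, str],
    generate: Callable[[str], str],  # hypothetical: prompt -> one sampled completion
    n_samples: int = 200,
) -> List[dict]:
    """Draw n_samples completions for every problem prompt."""
    samples = []
    for task_id, prompt in problems.items():
        for _ in range(n_samples):
            samples.append({"task_id": task_id, "completion": generate(prompt)})
    return samples


def write_jsonl(path: str, rows: Iterable[dict]) -> None:
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
```

In practice, `generate` would wrap a sampling call to the model (temperature sampling rather than greedy decoding, so that the 200 completions differ), and the per-problem pass@k values are averaged over all problems. Assuming the OpenAI human-eval package is installed, a JSONL file in this shape should be scorable with its `evaluate_functional_correctness` command.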