Why
We want to support evaluation for ReplitLM models with the bigcode-evaluation-harness.
What changed
We add an `evaluation/` directory with code to run evaluation with the harness:

- `evaluation/README.md`: instructions and steps on how to set up and run evaluation
- `evaluation/eval.py`: a fork of the original harness `main.py`, with patches for tokenizer decoding and flash attention (see the sketch after this list)
- `evaluation/scripts`: bash scripts to help parameterize and run `eval.py`
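The two patches line up with what the public ReplitLM model card prescribes for loading and decoding. A minimal sketch, assuming the fork loads the checkpoint through Hugging Face `transformers`; the `replit/replit-code-v1-3b` checkpoint name is illustrative, not taken from this PR:

```python
# Sketch of the two patches, following the public ReplitLM model card;
# an assumed reconstruction, not the PR's exact code.
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

MODEL = "replit/replit-code-v1-3b"  # illustrative checkpoint name

# Flash-attention patch: switch the attention implementation to the
# triton kernel before instantiating the model.
config = AutoConfig.from_pretrained(MODEL, trust_remote_code=True)
config.attn_config["attn_impl"] = "triton"
model = AutoModelForCausalLM.from_pretrained(
    MODEL, config=config, trust_remote_code=True
)

# Tokenizer-decoding patch: skip special tokens and leave whitespace
# untouched, since generated code is whitespace-sensitive.
tokenizer = AutoTokenizer.from_pretrained(MODEL, trust_remote_code=True)

def decode(token_ids):
    return tokenizer.decode(
        token_ids,
        skip_special_tokens=True,
        clean_up_tokenization_spaces=False,
    )
```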
Test plan
Ran and reproduced numbers.
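For context, a reproduction run through the fork might look like the sketch below. The flags are the upstream bigcode-evaluation-harness CLI's, assumed to be kept by `eval.py`; the task and sampling values are placeholders, not the PR's actual settings.

```python
# Hypothetical reproduction run; assumes evaluation/eval.py keeps the
# upstream bigcode-evaluation-harness flags. Values are placeholders.
import subprocess

subprocess.run(
    [
        "python", "evaluation/eval.py",
        "--model", "replit/replit-code-v1-3b",
        "--tasks", "humaneval",
        "--n_samples", "100",
        "--temperature", "0.2",
        "--allow_code_execution",
    ],
    check=True,
)
```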
Rollout