huggingface / lighteval

LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.
MIT License
471 stars 55 forks

Add HumanEval and HumanEval+ #63

Open lewtun opened 4 months ago

lewtun commented 4 months ago

The HumanEval and HumanEval+ benchmarks are staples for benchmarking the code capabilities of base LLMs. It would be nice to include them in lighteval so one doesn't have to switch to another framework like BigCode's.

0-hero commented 3 months ago

+1, would be nice to have