LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.
AGIEval is a set of benchmarks popularised by Teknium/Nous through models like OpenHermes. It would be nice to include it in lighteval so we can compare internally how our models stack up on this benchmark :)
Ref command from AutoEval: