mlabonne / llm-autoeval

Automatically evaluate your LLMs in Google Colab
MIT License

Use lighteval for benchmarking #19

Closed burtenshaw closed 3 months ago

burtenshaw commented 4 months ago

This PR implements benchmarking with lighteval within the Colab & Runpod configuration.

There is still some work needed to unify the code base:
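Unifying the code base could mean exposing the existing benchmark suite and the new lighteval runner behind one entry point. A minimal sketch of that idea, with all names hypothetical (none are taken from this PR or from the lighteval API):

```python
# Hypothetical sketch: dispatch between the existing benchmark path and a
# lighteval-backed path via a single evaluate() entry point. Function and
# key names are illustrative, not the actual llm-autoeval interface.
from typing import Callable, Dict, List


def run_lighteval(model_id: str, tasks: List[str]) -> dict:
    # Placeholder for the lighteval-backed runner this PR adds.
    return {"backend": "lighteval", "model": model_id, "tasks": tasks}


def run_legacy(model_id: str, tasks: List[str]) -> dict:
    # Placeholder for the pre-existing benchmark path.
    return {"backend": "legacy", "model": model_id, "tasks": tasks}


RUNNERS: Dict[str, Callable[[str, List[str]], dict]] = {
    "lighteval": run_lighteval,
    "legacy": run_legacy,
}


def evaluate(model_id: str, tasks: List[str], backend: str = "lighteval") -> dict:
    """Route an evaluation request to the selected backend."""
    try:
        runner = RUNNERS[backend]
    except KeyError:
        raise ValueError(f"Unknown backend: {backend!r}") from None
    return runner(model_id, tasks)
```

With this shape, the Colab notebook and the RunPod entry script would both call `evaluate()` and select the backend from their configuration, rather than duplicating benchmark-specific logic in each environment.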

burtenshaw commented 4 months ago

@mlabonne It would be really cool to contribute this. Let me know what you think about integrating it with the existing benchmarks.