qcri / LLMeBench

Benchmarking Large Language Models
80 stars, 16 forks