qcri / LLMeBench

Benchmarking Large Language Models