Jellyfish042 / uncheatable_eval

Evaluating LLMs with Dynamic Data
MIT License
66 stars, 4 forks