mazzzystar / TurtleBenchmark

Benchmark for LLM Reasoning & Understanding with Challenging Tasks from Real Users.
https://mazzzystar.github.io/2024/08/09/turtle-benchmark-zh/
101 stars, 7 forks