mazzzystar / TurtleBench

TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles.
https://arxiv.org/abs/2410.05262
Apache License 2.0
125 stars 9 forks

refactor: restructure all code for TurtleBench project #7

Closed by Duguce 1 month ago