Feat/results by pairing

clp-research / clembench

A Framework for the Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents and an Extensible Benchmark

MIT License

Closed by phisad 5 months ago

phisad commented 5 months ago

Results are now stored by pairing instead of by game.
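To illustrate the kind of change described, here is a minimal sketch, assuming results live in a directory tree and that the change swaps which key forms the top level. The function names, the `results` root, and the example game and pairing names are all hypothetical and not taken from the clembench code base; the actual layout in the repository may differ.

```python
from pathlib import Path


def results_dir_by_game(root: str, game: str, pairing: str) -> Path:
    # Hypothetical "before" layout: the game is the top-level key.
    return Path(root) / game / pairing


def results_dir_by_pairing(root: str, game: str, pairing: str) -> Path:
    # Hypothetical "after" layout: the pairing is the top-level key,
    # so all games played by one pairing sit under a single directory.
    return Path(root) / pairing / game


# Illustrative names only; not real game or model identifiers.
old_path = results_dir_by_game("results", "game_a", "model_x--model_y")
new_path = results_dir_by_pairing("results", "game_a", "model_x--model_y")
```

Under this assumption, grouping by pairing makes it straightforward to collect everything one pairing produced across games by reading a single subtree.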