clp-research / clembench

A Framework for the Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents and an Extensible Benchmark
MIT License

add workflow documentation for adding to leaderboard #14

Closed davidschlangen closed 4 months ago

davidschlangen commented 8 months ago

We need a document that details all the steps that need to be taken to add a new line to the leaderboard (= run a new model, or run a model again).

See the versioning_running.md planning document: changes in the set of games constitute an increment in the major version number of the benchmark / leaderboard; changes to the instances for a set of games constitute an increment in the minor version.
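
As a rough illustration of that versioning rule, here is a minimal Python sketch. The names `BenchmarkVersion` and `next_version` are hypothetical and not part of clembench, and resetting the minor number to 0 on a major bump is an assumption that the planning document may or may not share.

```python
from typing import NamedTuple


class BenchmarkVersion(NamedTuple):
    """Hypothetical major.minor version of the benchmark / leaderboard."""
    major: int
    minor: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}"


def next_version(current: BenchmarkVersion,
                 games_changed: bool,
                 instances_changed: bool) -> BenchmarkVersion:
    """Return the version that a new leaderboard run should be filed under."""
    if games_changed:
        # A change in the set of games increments the major version.
        # Assumption: the minor version restarts at 0.
        return BenchmarkVersion(current.major + 1, 0)
    if instances_changed:
        # New instances for the same set of games increment the minor version.
        return BenchmarkVersion(current.major, current.minor + 1)
    # Running a new model (or re-running one) leaves the version unchanged.
    return current
```

Under this sketch, adding a model to the leaderboard never changes the version by itself; only changes to the games or their instances do.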

davidschlangen commented 5 months ago

This was done, right? @sherzod-hakimov

sherzod-hakimov commented 4 months ago

Yes, there is a document written for it: https://github.com/clp-research/clembench/blob/main/docs/howto_benchmark_workflow.md