clp-research / clembench

A Framework for the Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents and an Extensible Benchmark
MIT License

[backends] Benchmark configuration file to change settings such as openai-compatible API backend #112

Open Gnurro opened 1 month ago

Gnurro commented 1 month ago

Currently, retry values such as the number of tries before aborting and the delay before retrying are hardcoded in the openai-compatible API backend code. However, different providers have different limits, and these can also differ per model (groq, for example), so it would be good to set the retry behavior in an external file rather than in code that should stay static across projects and experiments. The retry library (https://github.com/invl/retry) also allows for more dynamic retrying, so a retry configuration could handle different return codes/payloads differently, dynamically adjusting retry behavior to the provider/API limits/model.
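A minimal sketch of what this could look like: retry settings are read from a per-provider mapping (as they might appear in an external configuration file) rather than hardcoded in the backend. All provider names, keys, and values here are illustrative assumptions, not clembench's actual configuration schema.

```python
import time

# Hypothetical per-provider retry settings, as they might appear in an
# external benchmark configuration file instead of in backend code.
# Values are purely illustrative.
RETRY_CONFIG = {
    "default": {"tries": 3, "delay": 1.0, "backoff": 2.0},
    "groq": {"tries": 5, "delay": 10.0, "backoff": 1.5},
}


def call_with_retry(func, provider="default", config=RETRY_CONFIG):
    """Call func(), retrying on exceptions per the provider's settings."""
    settings = config.get(provider, config["default"])
    delay = settings["delay"]
    for attempt in range(1, settings["tries"] + 1):
        try:
            return func()
        except Exception:
            if attempt == settings["tries"]:
                raise  # out of tries: abort the request
            time.sleep(delay)
            delay *= settings["backoff"]  # back off before the next try
```

With the retry library itself, the same settings could instead be passed to `retry_call(func, tries=..., delay=..., backoff=...)`; a richer configuration could additionally map specific HTTP status codes or error payloads to different retry settings.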

sherzod-hakimov commented 1 month ago

Maybe it makes sense to have a configuration file for each benchmark run that includes info such as

Gnurro commented 1 month ago

Right, one consolidated configuration file for a specific benchmark run could hold these kinds of settings as well. It might also allow for specific pre-set combinations of instances and models, as I've seen with student projects using clembench - these are currently done as shell scripts, which some potential users might not be familiar with.