clp-research / clembench

A Framework for the Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents and an Extensible Benchmark
MIT License

remove GPT4 gymnastics from `pipeline_clembench.sh` #8

Closed by davidschlangen 8 months ago

davidschlangen commented 8 months ago

GPT4 gets special treatment in the script, for historical reasons (at least that's my explanation: the main API key didn't have access to it at the time). This isn't necessary, and the script should be consolidated so that running the benchmark really is just one call.