clp-research / clembench

A Framework for the Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents and an Extensible Benchmark
MIT License

[referencegame] regex fix #73

Closed AnneBeyer closed 5 months ago

AnneBeyer commented 6 months ago

Here is the instances.json with the (hopefully) fixed regex. I haven't adapted the parsing yet, as this may change the scores, which would then no longer be compatible with the other v1.5 results.

AnneBeyer commented 6 months ago

The latest commits now also contain the strict-mode parsing changes.