clp-research / clembench

A Framework for the Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents and an Extensible Benchmark
MIT License

improve tokenizer in taboo #9

Closed davidschlangen closed 4 months ago

davidschlangen commented 8 months ago

I just saw a game where the clue

A term that refers to the act of completing a task without professional help

allegedly contains the taboo words "diy, do, it, yourself". My only explanation is that the "it" inside "without" registered as a hit for "it", which is not what the rules intend. The check should test whether a taboo word occurs as a token in the clue, not whether it occurs as a substring.
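To illustrate the difference, here is a minimal sketch contrasting the two checks on the clue above. The function names `substring_hits` and `token_hits` and the regex-based tokenization are assumptions made for illustration only; they are not taken from the clembench code, and the actual check may tokenize or normalize words differently.

```python
import re

# Hypothetical helpers for illustration; not the clembench implementation.

def substring_hits(clue, taboo_words):
    """Naive check: a taboo word counts if it appears anywhere in the string."""
    clue_lower = clue.lower()
    return [w for w in taboo_words if w.lower() in clue_lower]

def token_hits(clue, taboo_words):
    """Token check: a taboo word counts only if it matches a whole token."""
    # Assumed tokenization: lowercase runs of letters and digits.
    tokens = set(re.findall(r"[a-z0-9]+", clue.lower()))
    return [w for w in taboo_words if w.lower() in tokens]

clue = "A term that refers to the act of completing a task without professional help"
taboo = ["diy", "do", "it", "yourself"]

print(substring_hits(clue, taboo))  # ['it'], matched inside "without"
print(token_hits(clue, taboo))      # [], no taboo word occurs as a token
```

With the token-based version, a clue such as "Do it yourself!" would still be flagged for "do", "it" and "yourself", since each appears as a whole token.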

phisad commented 4 months ago

Fixed in https://github.com/clp-research/clembench/pull/57 (see the test_clue_check_issue9 test case).