clp-research / clembench

A Framework for the Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents and an Extensible Benchmark

Local backends fix #18

Closed Gnurro closed 10 months ago

Gnurro commented 10 months ago

Fix to properly cull the prompt from the model output. The old code uses tokenizer.apply_chat_template(messages, tokenize=False) to get the chat-formatted prompt as a string - but this call does not return the string you get by decoding the tokenized prompt; the returned string is missing whitespace, which prevents the string match used for the replacement. Decoding the actual tokenized prompt instead solves this issue.
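A minimal sketch of the approach described above, assuming a HuggingFace decoder-only chat model loaded with transformers; the model id, variable names, and generation parameters are illustrative, not the actual clembench backend code:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: any HF chat model with a chat template; Llama-2-chat is the case that exposed the bug.
model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

messages = [{"role": "user", "content": "Say hello."}]

# Tokenize the chat-formatted prompt once and keep the token ids.
prompt_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(prompt_ids, max_new_tokens=64)

# Old approach (problematic for the Llama2-chat tokenizer):
#   prompt_text = tokenizer.apply_chat_template(messages, tokenize=False)
# That string can differ in whitespace from what decode() produces, so
# full_text.replace(prompt_text, "") fails to cull the prompt.

# Fixed approach: decode the same token ids that were fed to the model,
# so the decoded prompt matches the prefix of the decoded output.
prompt_text = tokenizer.decode(prompt_ids[0], skip_special_tokens=True)
full_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
response_text = full_text.replace(prompt_text, "").strip()

print(response_text)
```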

Gnurro commented 10 months ago

As a note: the issue occurs with the Llama2-chat tokenizer - it does not occur with the ~10 other tokenizers I've tested this with, hence the oversight. I changed it for the HF backend as well, to prevent this from occurring if a model/tokenizer added in the future has the same issue.