This cannot be done naively using the logit bias parameter, for the reasons you mention, although you could take a resampling approach to make it work. I think it'd be great if the API let you specify a grammar or state machine to give more control over sampling, but that's out of scope for this repo.
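A minimal sketch of that resampling idea, assuming the openai v1 Python client; the model name, forbidden-word list, and retry budget are illustrative:

```python
from openai import OpenAI

client = OpenAI()
FORBIDDEN = {"impossible"}  # illustrative forbidden-word list

def complete_without_banned(prompt: str, max_tries: int = 5) -> str:
    """Resample until the completion contains no forbidden word."""
    for _ in range(max_tries):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model name
            messages=[{"role": "user", "content": prompt}],
        )
        text = resp.choices[0].message.content or ""
        # Crude substring check; real code would normalize casing/punctuation.
        if not any(word in text.lower() for word in FORBIDDEN):
            return text
    raise RuntimeError("no acceptable completion within the retry budget")
```

The trade-off is extra API calls for every rejected sample, and there is no guarantee of success within the retry budget.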
I need to use the OpenAI Chat endpoint and prevent certain words from appearing in the model's completion. To achieve this, I use the logit_bias parameter, which expects a mapping from token ids to an int value (-100 in my case). This is where tiktoken comes in.

PROBLEM: the forbidding mechanism described above works at the token level, not at the word level. Because some tokens are shared across words, forbidding a token with a specific word in mind can have unintended consequences, namely forbidding other words in the process.
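A minimal sketch of this naive setup, assuming the openai v1 Python client; the model name and prompt are illustrative:

```python
import tiktoken
from openai import OpenAI

enc = tiktoken.get_encoding("cl100k_base")
client = OpenAI()

# Naive word ban: bias every token of the forbidden word to -100.
banned_tokens = enc.encode("impossible")
logit_bias = {str(tok): -100 for tok in banned_tokens}  # the API keys biases by token id

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Is this task feasible?"}],
    logit_bias=logit_bias,
)
print(resp.choices[0].message.content)
```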
EXAMPLE: I want the model's completion to NOT include the word "impossible". The tokens for this word under cl100k_base are [318, 10236], so I pass to my OpenAI query: logit_bias = {318: -100, 10236: -100}. However, because the token list for the word "possible" under the same encoding is [10236], this could result in an unwanted ban of the word "possible".
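The overlap is easy to check with tiktoken itself; the token ids in the comments are the ones reported above:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

impossible = enc.encode("impossible")  # [318, 10236] as reported above
possible = enc.encode("possible")      # [10236] as reported above

# Any token id shared between the two words gets banned for both.
shared = set(impossible) & set(possible)
print(shared)  # non-empty, so biasing "impossible" away also penalizes "possible"
```

Note that BPE tokenization is context-sensitive: encoding " impossible" with a leading space yields different token ids, so a token-level ban built from the bare word can also miss the word entirely when it appears mid-sentence.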