Config - Are the logit_bias strings tokenized?

LagPixelLOL / ChatGPTCLIBot

ChatGPT Bot in CLI with long term memory support using Embeddings.

MIT License

340 stars, 38 forks

Config - Are the logit_bias strings tokenized? #18

Closed: soccerwithag closed this issue 1 year ago

soccerwithag commented 1 year ago

For OpenAI's API, logit_bias accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. I think words need to be tokenized before they can be passed as logit bias effectively? The config guide here uses words, not tokens, so I was confused. Sorry if I'm being dumb here, and thanks for this tool!

https://platform.openai.com/docs/api-reference/completions/create#completions/create-logit_bias

LagPixelLOL commented 1 year ago

If you look at the code here, the strings are all tokenized before they are sent to the API. I automated this step so the config is easier to use.
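
For readers following along, here is a rough sketch of what that kind of conversion could look like. It is not the project's actual code: `build_logit_bias`, `toy_encode`, and the token IDs in `TOY_VOCAB` are all made up for illustration, and a real setup would pass in a real tokenizer's encode function instead of the toy one.

```python
import json
from typing import Callable, Dict, List


def build_logit_bias(
    word_biases: Dict[str, float],
    encode: Callable[[str], List[int]],
) -> Dict[str, float]:
    """Expand a word -> bias mapping into a token-ID -> bias mapping.

    Hypothetical helper: every token that a word encodes to receives
    that word's bias, clamped to the API's allowed range of -100..100.
    """
    token_biases: Dict[str, float] = {}
    for word, bias in word_biases.items():
        clamped = max(-100.0, min(100.0, bias))
        for token_id in encode(word):
            # JSON object keys are strings, so store the ID as a string.
            token_biases[str(token_id)] = clamped
    return token_biases


# Toy stand-in for a real tokenizer. These IDs are invented for the
# example and do not correspond to any real GPT vocabulary.
TOY_VOCAB = {"hello": [101], "goodbye": [202, 203]}


def toy_encode(text: str) -> List[int]:
    return TOY_VOCAB[text]


# Config-style input: plain words mapped to bias values.
logit_bias = build_logit_bias({"hello": -100, "goodbye": 150}, toy_encode)
print(json.dumps(logit_bias))
```

Note that a word can split into several tokens, in which case each token gets the bias, and that GPT tokenizers usually encode a word differently with and without a leading space, so a real implementation may want to handle both variants.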

soccerwithag commented 1 year ago

I came back to comment that I just found it in the code too! Awesome, thank you for this project.