Thanks, I haven't released a version with it yet, but will do so soon. In the meantime it's visible in the repo, so you can use it if you install from source: https://github.com/openai/tiktoken/blob/main/tiktoken/_educational.py
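For anyone installing from source in the meantime, here is a minimal usage sketch based on the example in the README (the names `train_simple_encoding` and `SimpleBytePairEncoding` come from `_educational.py` and may change before the release):

```python
# Requires tiktoken installed from source, since the educational
# module has not shipped in a released version yet.
from tiktoken._educational import SimpleBytePairEncoding, train_simple_encoding

# Train a small BPE tokeniser on a toy corpus and watch the merges happen.
enc = train_simple_encoding()

# Visualise how the GPT-4 encoder (cl100k_base) splits a string into tokens.
enc = SimpleBytePairEncoding.from_tiktoken("cl100k_base")
enc.encode("hello world aaaaaaaaaaaa")
```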
Sorry for hijacking the issue, but is there any documentation or description of the token-to-ID pairs? In particular, I would like to add a logit bias to the stop token to increase the likelihood of shorter or longer answers with gpt-3.5/4, but I haven't found any way to figure out how the stop token is encoded. Maybe an example or a description of that could be added to the educational submodule.
@dvolgyes The `<|im_end|>` token is used to mark the end of a message; in this case it marks the end of the assistant's message. In the "Extending tiktoken" code snippet you can see that it is mapped to token 100265. However, the API doesn't allow a logit bias for tokens higher than 100257.
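For anyone else looking this up, a short sketch based on the "Extending tiktoken" example from the README that shows the ID `<|im_end|>` maps to (it relies on the same `cl100k_base` private attributes the README uses):

```python
import tiktoken

# Build an encoding that adds the chat special tokens, following the
# "Extending tiktoken" example from the README.
cl100k_base = tiktoken.get_encoding("cl100k_base")
enc = tiktoken.Encoding(
    name="cl100k_im",
    pat_str=cl100k_base._pat_str,
    mergeable_ranks=cl100k_base._mergeable_ranks,
    special_tokens={
        **cl100k_base._special_tokens,
        "<|im_start|>": 100264,
        "<|im_end|>": 100265,
    },
)

# The stop token encodes to a single ID: [100265].
print(enc.encode("<|im_end|>", allowed_special={"<|im_end|>"}))
```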
@microsoftbuild Thanks for the explanation! Too bad, it would be a neat way to steer the model.
This was released in 0.5.0
Following the README (https://github.com/openai/tiktoken/blob/main/README.md), I tried the following, which did not work.
Version:
tiktoken==0.4.0
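As noted above, the educational submodule was only released in 0.5.0, so it isn't importable on 0.4.0. A quick sketch (my own check, not from the README) for verifying the installed version before importing:

```python
from importlib.metadata import version

# The educational submodule first shipped in tiktoken 0.5.0,
# so importing it on 0.4.0 fails.
installed = version("tiktoken")
print(f"tiktoken {installed} installed")

if tuple(int(x) for x in installed.split(".")[:2]) >= (0, 5):
    from tiktoken._educational import SimpleBytePairEncoding
    enc = SimpleBytePairEncoding.from_tiktoken("cl100k_base")
    enc.encode("hello world")
else:
    print("Upgrade with `pip install -U tiktoken` to get tiktoken._educational")
```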