Token counting information

gencay / vscode-chatgpt

An unofficial Visual Studio Code - OpenAI ChatGPT integration

ISC License

3.5k stars, 755 forks

Token counting information #207

Closed ProtoxiDe22 closed 1 year ago

ProtoxiDe22 commented 1 year ago

Describe the feature

I'm wondering if it would be possible to have some kind of tooltip or indicator in the conversation window that shows at a glance how many tokens the conversation has used so far and how many the next message will use.

ncesar commented 1 year ago

You can get this data in the OpenAI Dashboard. I don't think they have an API that exposes it.

ProtoxiDe22 commented 1 year ago

I was just researching this. While there's no API for it, there seem to be multiple ways of doing this. First of all, there's a tokenizer tool right on the OpenAI site: https://platform.openai.com/tokenizer There's also a guide in the OpenAI cookbook that shows how to do this: https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb It suggests that this is very easy to do in Python with tiktoken, but none of the JS libraries it lists support the newest encodings.

On the other hand, the tokenizer on the website runs locally in JavaScript, so it's definitely possible in JS. If the code of that page were public, it'd be a really easy task.
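Until a JS tokenizer with the newest encodings is available, a rough pre-send estimate is possible without any dependency. The sketch below is not real tokenization: it only applies the rule of thumb from the OpenAI tokenizer page that one token corresponds to roughly 4 characters of common English text. The names `CHARS_PER_TOKEN` and `estimateTokens` are made up for illustration, and the result can be well off for code or non-English text.

```typescript
// Rough heuristic only: about 4 characters per token for common English text.
// This does NOT run a real tokenizer, so treat the result as a ballpark figure.
const CHARS_PER_TOKEN = 4;

// Hypothetical helper: estimate how many tokens a prompt might use.
function estimateTokens(text: string): number {
  // Round up so that any non-empty text counts as at least one token.
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}
```

Something like `estimateTokens(nextMessage)` could feed a tooltip before the request is sent, with the exact count replacing it once the response arrives.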

ProtoxiDe22 commented 1 year ago

Also, while counting tokens for the next request before sending it might require more work, counting tokens already used seems pretty trivial, since the number of tokens used is included in the API response. Implementing just this could already be a massive improvement: you could see at a glance how many tokens the last request used, and from there guesstimate roughly how many the next one will use.
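A minimal sketch of that idea, assuming the response carries the `usage` object with `prompt_tokens`, `completion_tokens`, and `total_tokens` that OpenAI completion responses include. `TokenTracker`, `record`, and `summary` are hypothetical names, not part of the extension, and how the extension actually receives the response is not shown here.

```typescript
// Shape of the `usage` object included in OpenAI completion responses.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Hypothetical helper: remembers the last request's usage and a running total.
class TokenTracker {
  private last: Usage | undefined;
  private conversationTotal = 0;

  // Call once per API response; responses without usage data are ignored.
  record(usage: Usage | undefined): void {
    if (!usage) {
      return;
    }
    this.last = usage;
    this.conversationTotal += usage.total_tokens;
  }

  // Text that could be shown in a tooltip or status bar item.
  summary(): string {
    const lastTotal = this.last ? this.last.total_tokens : 0;
    return `Last request: ${lastTotal} tokens | Conversation: ${this.conversationTotal} tokens`;
  }
}
```

Because each request resends the earlier messages as part of the prompt, the running total here reflects tokens consumed across all requests, not the length of the conversation text itself.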

gencay commented 1 year ago

This could definitely be a great addition, and it's definitely useful. It would increase the package size and startup time of the extension, though. Adding more verbose logs is something we plan to address in future releases, but for now our goal is to keep the extension as minimal as possible, with very few dependencies, so the experience is flawless and fast for developers!

ProtoxiDe22 commented 1 year ago

What about implementing the token count by taking it from the response? That doesn't add any dependency or overhead, and I think it should be pretty easy to implement.