Closed: paschembri closed this issue 2 months ago
Thank you for your proposal. There's actually an easy way to add a 3rd-party dependency.
It should be added this way. Unluckily, I found that only after I had implemented it within stdlib boundaries.
Anyway, I'd rather not add the transformers package if it's more than, say, 10 MB with all its dependencies. This plugin is pretty light and straightforward so far, and I'd like to keep it that way.
So here's my proposal to you: if the transformers package isn't too big, you're very welcome to use it in your PR; if it is, please consider making the token count less precise, but without such a package.
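The "less precise but dependency-free" route suggested here could be as simple as the following sketch; the 4-characters-per-token ratio is the common rule of thumb for English text, not an exact figure:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate without any third-party package.

    Assumes the usual rule of thumb that one token is ~4 characters
    of English text; this is an approximation, not a real tokenizer.
    """
    return max(1, round(len(text) / 4))
```

For prose and code in the usual GPT model ranges this tends to land within a few tens of percent of the real count, which may be enough for a pre-flight limit check.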
I'll look into it, but I know the transformers package does magic stuff (like downloading vocab files…).
I think extracting a basic rule may be simpler…
Thanks for the link on how to add deps; it will be useful…
I've found the library that is suggested for counting tokens locally. I have to check its size before including it in the project anyway.
https://github.com/wbond/package_control_channel/blob/master/repository/dependencies.json
It appears that only packages from this list can be used as dependencies.
I managed to get a tiktoken server running locally (this is overkill but fits my needs).
Small plugin here: https://github.com/paschembri/sublime-plugin-maker/tree/main/examples/Tiktoken-Counter
Tiktoken server here: https://github.com/paschembri/tiktoken-server
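A plugin-side client for such a local server could be sketched roughly like this; note that the `/count` endpoint, the `{"text": ...}` payload, and the `count` response field are assumptions for illustration, not the actual API of the linked tiktoken-server:

```python
import json
import urllib.request


def build_count_request(text, base_url="http://127.0.0.1:8000"):
    """Build the HTTP request for a local token-counting server.

    The /count path and the {"text": ...} JSON body are hypothetical;
    check the server's README for its real API before using this.
    """
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/count",
        data=payload,
        headers={"Content-Type": "application/json"},
    )


def count_tokens_remote(text, base_url="http://127.0.0.1:8000"):
    # Send the request and read the (assumed) {"count": N} response.
    req = build_count_request(text, base_url)
    with urllib.request.urlopen(req, timeout=2) as resp:
        return json.loads(resp.read())["count"]
```

Keeping the server local avoids shipping any dependency with the package at the cost of requiring the user to run an extra process.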
@paschembri I believe it could be a toggleable feature. If a user really wants to install something extra, he or she could have it, but by default it would be disabled.
And just out of curiosity, why do you need it that much? Could you please elaborate on your workflow?
The issue arises from not knowing the prompt size before sending a request.
So being able to get the token count from the text selection is key.
@paschembri Am I getting you right that you're mostly working with edge-scale text corpora, i.e. ones pretty close to a model's upper bound?
I mean, on my side I almost never hit the single-prompt limit on either gpt-4 or gpt-4-32k, even though I'm passing pretty big code chunks in my daily workflow. So I'm still trying to find the answer to why such accuracy is necessary. Even taking the estimation error into account, at a scale under 10k tokens it shouldn't be large enough to affect the user flow in a dramatic way, if we just presume that each token ~ 4 chars. So what am I missing?
P.S.: I'm not arguing in any way here, just trying to dig deep enough into your case.
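To put rough numbers on that estimation error: English text usually falls somewhere around 3 to 5 characters per token, so a character count gives a band rather than a point estimate. The bounds below are illustrative assumptions, not measured values:

```python
def estimate_range(n_chars, low=3.0, high=5.0):
    """Return (min, max) plausible token counts for a character count.

    The chars-per-token bounds of ~3 and ~5 are an illustrative
    assumption about typical English text, not exact figures.
    """
    return (n_chars / high, n_chars / low)
```

For a 40,000-character prompt this gives roughly 8,000 to 13,300 tokens, which shows why the 4-chars heuristic is usually fine well below a model's limit but gets risky right at the boundary.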
I am using different models (3.5-turbo, 3.5-turbo-16k and 4), but I don't have access to gpt-4-32k, and I find it useful to compare outputs from different models.
Depending on the prompt and the use case, there are 2 edge cases that I often encounter:
large prompt => small expected output: this is the worst-case scenario, where I want to be able to tell how many tokens the prompt is.
Now, you're totally right that we could just presume that each token ~ 4 chars.
In the very last release (2.1.0) this case is covered in a straightforward way: if the sum of the prompt and max_tokens values exceeds a model's limit, the OpenAI gateway throws an error reporting both exact numbers: the prompt tokens that were passed and max_tokens. Before the last release I just silenced that message, but now I pass it through to the user via a Sublime error popup and the Sublime console log. Have you tried that already? Does it meet your needs?
Anyway, I still believe it's a good feature to count tokens locally, at least in some approximate way, and maybe it's worth a shot to add your more precise solution as an optional one. But before doing the latter I have to figure out its use cases to design a proper solution.
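If one wanted to reuse that gateway error programmatically rather than only showing it to the user, the token numbers could be pulled out of the message text. This is a best-effort sketch: OpenAI's exact wording isn't guaranteed to stay stable across API versions, so the pattern is an assumption:

```python
import re

# Typical wording of OpenAI's context-length error; the exact phrasing
# may change between API versions, so treat this pattern as best-effort.
_CTX_ERR = re.compile(
    r"maximum context length is (\d+) tokens.*?"
    r"requested (\d+) tokens \((\d+) in the messages, "
    r"(\d+) in the completion\)",
    re.S,
)


def parse_context_error(message):
    """Return (limit, requested, prompt_tokens, max_tokens) or None."""
    m = _CTX_ERR.search(message)
    return tuple(int(g) for g in m.groups()) if m else None
```

Parsed numbers like these could feed a friendlier popup (e.g. "reduce the selection by N tokens") instead of the raw error text.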
https://github.com/yaroslavyaroslav/OpenAI-sublime-text/issues/29#issuecomment-1837598655 This news affects this issue as well.
Token count is currently based on the length of the selection (`if (self.settings.get("max_tokens") + len(self.text)) > 4000:`). However, OpenAI models use the `GPT2Tokenizer`, which would lead to a more accurate token count. As of now I didn't find an easy way to include 3rd-party deps in a Sublime package (the Python package `transformers` would be needed). What do you think about adding a setting enabling the user to override the 4000 limit? (Happy to provide a PR.)
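The override suggested here could be sketched as a small helper. The `max_context_tokens` setting name and `DEFAULT_LIMIT` constant are hypothetical, and the selection length is kept as the crude token proxy the plugin already uses:

```python
DEFAULT_LIMIT = 4000  # current hard-coded value in the plugin


def exceeds_limit(max_tokens, selection_text, limit=None):
    """Check a request against a user-overridable token limit.

    In the plugin, `limit` would come from a setting, e.g.
    settings.get("max_context_tokens", DEFAULT_LIMIT); that setting
    name is a suggestion, not the plugin's actual key. Selection
    length is used as a crude token proxy, mirroring the existing check.
    """
    if limit is None:
        limit = DEFAULT_LIMIT
    return (max_tokens + len(selection_text)) > limit
```

A user on a 16k- or 32k-context model would then just raise the setting instead of hitting the hard-coded 4000 check.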