pkoukk / tiktoken-go

go version of tiktoken
MIT License

[Efficiency] Inefficient to create encoders multiple times #51

Open omar-scio opened 2 weeks ago

omar-scio commented 2 weeks ago

I noticed our tests were much slower after switching from https://github.com/tiktoken-go/tokenizer to this library. It seems to be because tiktoken.EncodingForModel is very slow (~0.3s per call on my machine).
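A minimal benchmark along these lines shows the per-call initialization cost (the model name is just an example, not from the original report):

```go
package tokenizer

import (
	"testing"

	"github.com/pkoukk/tiktoken-go"
)

// Each iteration rebuilds the encoder from scratch, which is where the
// reported ~0.3s per call goes. Place this in a *_test.go file and run
// `go test -bench=EncodingForModel`.
func BenchmarkEncodingForModel(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if _, err := tiktoken.EncodingForModel("gpt-3.5-turbo"); err != nil {
			b.Fatal(err)
		}
	}
}
```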

I see that the source code already caches the encoding itself, but if you're doing that, why not cache the *Tiktoken as well? That's what we did to fix this in our own codebase, and the latency went away.
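A sketch of that kind of caller-side cache, keyed by model and safe for concurrent use (the `encoderFor` name is ours, not part of the library):

```go
package tokenizer

import (
	"sync"

	"github.com/pkoukk/tiktoken-go"
)

var encoders sync.Map // model name -> *tiktoken.Tiktoken

// encoderFor returns a cached *Tiktoken, building it at most once per model.
func encoderFor(model string) (*tiktoken.Tiktoken, error) {
	if enc, ok := encoders.Load(model); ok {
		return enc.(*tiktoken.Tiktoken), nil
	}
	enc, err := tiktoken.EncodingForModel(model)
	if err != nil {
		return nil, err
	}
	// Concurrent first calls may both build an encoder; LoadOrStore
	// keeps exactly one and discards the rest.
	actual, _ := encoders.LoadOrStore(model, enc)
	return actual.(*tiktoken.Tiktoken), nil
}
```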

I can put up a PR for this if the maintainers don't have the time.

pkoukk commented 1 week ago

This library already includes a caching mechanism. You can avoid the network cost of fetching the token dictionary by setting the TIKTOKEN_CACHE_DIR environment variable or by using the offline BPE loader. OpenAI has not stated that the token dictionary will always remain unchanged, so the caching mechanism is disabled by default.
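For reference, the two options look roughly like this; the loader comes from the companion pkoukk/tiktoken-go-loader package, and in practice you would pick one option or the other:

```go
package main

import (
	"fmt"
	"os"

	"github.com/pkoukk/tiktoken-go"
	tiktoken_loader "github.com/pkoukk/tiktoken-go-loader"
)

func main() {
	// Option 1: cache the downloaded dictionary on disk between runs.
	// Must be set before the first encoder is created.
	os.Setenv("TIKTOKEN_CACHE_DIR", "/tmp/tiktoken-cache")

	// Option 2: skip the network entirely with the offline BPE loader.
	tiktoken.SetBpeLoader(tiktoken_loader.NewOfflineLoader())

	enc, err := tiktoken.EncodingForModel("gpt-3.5-turbo")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(enc.Encode("hello world", nil, nil)))
}
```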

If you need to cache initialized Tiktoken objects, you can keep a global variable in your own code (similar to the cache sketched above); the Encode function is thread-safe. I don't want to implement this behavior in the library itself, because not everyone needs it and it would consume extra memory.

cemremengu commented 5 days ago

@omar-scio sorry, this is unrelated, but would you mind telling us why you switched libraries? I was comparing both and got curious when I saw your comment.