openai / tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

MIT License


Adds caching to get_encoding to avoid repeatedly constructing Encodings #248

Closed: tal7aouy closed this 9 months ago

tal7aouy commented 9 months ago

Summary

This PR adds caching to get_encoding to avoid repeatedly constructing Encodings.

Implementation

Adds the functools @lru_cache decorator to get_encoding so its return values are cached
Sets maxsize=None so the cache is unbounded and every requested encoding stays cached
First call constructs encoding as normal
Subsequent calls hit cache and avoid reconstructing
Locking and global state still handled properly
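
The approach listed above can be sketched as follows. This is a minimal illustration, not the PR's diff and not tiktoken's code: FakeEncoding and get_encoding_cached are hypothetical stand-ins, and only functools.lru_cache is a real standard-library API.

```python
from functools import lru_cache


class FakeEncoding:
    """Hypothetical stand-in for an expensive-to-build Encoding object."""

    constructed = 0  # counts how many instances have been built

    def __init__(self, name: str) -> None:
        FakeEncoding.constructed += 1
        self.name = name


@lru_cache(maxsize=None)  # unbounded: every distinct name stays cached
def get_encoding_cached(name: str) -> FakeEncoding:
    # Runs only on a cache miss; later calls with the same name
    # return the object that was built the first time.
    return FakeEncoding(name)


first = get_encoding_cached("example")   # miss: constructs the object
second = get_encoding_cached("example")  # hit: returns the same object
```

Because the cache holds a reference to each result, repeated calls return the identical object rather than an equal copy, so callers should treat the returned object as shared.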

Benefits

Significant speedups when requesting the same encodings multiple times
Small change that touches very little code
Fully backwards compatible

Future Improvements

Add max cache size to bound memory usage
Support cache keying off config instead of just name
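
Both ideas above could look roughly like this. It is a sketch under assumptions: EncodingConfig, BuiltEncoding and build_encoding are hypothetical names invented for illustration, and the maxsize of 2 is chosen only to make eviction easy to see.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Tuple


@dataclass(frozen=True)  # frozen dataclasses are hashable, so they work as cache keys
class EncodingConfig:
    """Hypothetical configuration that identifies an encoding."""

    name: str
    special_tokens: Tuple[str, ...] = ()


class BuiltEncoding:
    """Hypothetical stand-in for the expensive object being cached."""

    def __init__(self, config: EncodingConfig) -> None:
        self.config = config


@lru_cache(maxsize=2)  # bounded: the least recently used entry is evicted
def build_encoding(config: EncodingConfig) -> BuiltEncoding:
    return BuiltEncoding(config)


a1 = build_encoding(EncodingConfig("a"))
a2 = build_encoding(EncodingConfig("a"))  # equal config, so this is a cache hit
build_encoding(EncodingConfig("b"))
build_encoding(EncodingConfig("c"))       # cache is full, so "a" is evicted
a3 = build_encoding(EncodingConfig("a"))  # miss again: a new object is built
```

Keying on the whole config means two configs that share a name but differ in special_tokens get separate cache entries, at the cost of requiring every field to be hashable.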

Overall, this small change gives a large performance improvement by caching encoding objects.

Please let me know if any changes are needed!

hauntsaninja commented 9 months ago

There is already caching.
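
For readers unfamiliar with caching that does not rely on lru_cache, one common pattern is a module-level dictionary guarded by a lock. The sketch below does not reproduce tiktoken's source; _CACHE, _construct and get_cached are hypothetical names used only to show the pattern.

```python
import threading

_CACHE = {}                 # name -> constructed object
_lock = threading.RLock()   # guards construction and insertion
build_calls = []            # records each real construction, for demonstration


def _construct(name):
    """Hypothetical expensive constructor."""
    build_calls.append(name)
    return object()


def get_cached(name):
    # Fast path: return an existing entry without taking the lock.
    cached = _CACHE.get(name)
    if cached is not None:
        return cached
    with _lock:
        # Re-check under the lock in case another thread built it first.
        cached = _CACHE.get(name)
        if cached is None:
            cached = _construct(name)
            _CACHE[name] = cached
        return cached


results = []
threads = [
    threading.Thread(target=lambda: results.append(get_cached("example")))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With a cache like this in place, wrapping the same function in lru_cache would add a second layer that stores the same objects again without saving any construction work.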