zurawiki / tiktoken-rs

Ready-made tokenizer library for working with GPT and tiktoken

return correct context length for `text-embedding-ada-002` #41

Closed: ursachec closed this 11 months ago

ursachec commented 11 months ago

From https://openai.com/blog/new-and-improved-embedding-model:

> Longer context. The context length of the new model is increased by a factor of four, from 2048 to 8192, making it more convenient to work with long documents.
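In other words, the fix amounts to updating the token limit the library reports for `text-embedding-ada-002` from 2048 to 8192. Below is a minimal sketch of what such a model-name-to-context-length lookup could look like; the function name `context_size` and the non-ada-002 arms are illustrative assumptions, not the crate's actual implementation.

```rust
// Illustrative sketch only (not tiktoken-rs's actual code): map a model name
// to its maximum context length in tokens. The fix in this PR corresponds to
// `text-embedding-ada-002` reporting 8192 instead of 2048.
fn context_size(model: &str) -> usize {
    match model {
        // Second-generation embedding model: context quadrupled, 2048 -> 8192.
        "text-embedding-ada-002" => 8192,
        // First-generation embedding models keep the 2048-token limit
        // (hypothetical arm, for illustration).
        m if m.starts_with("text-embedding-") => 2048,
        // Conservative fallback for unknown models (assumption).
        _ => 2048,
    }
}

fn main() {
    assert_eq!(context_size("text-embedding-ada-002"), 8192);
    println!(
        "text-embedding-ada-002 context length: {} tokens",
        context_size("text-embedding-ada-002")
    );
}
```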
ursachec commented 11 months ago

@zurawiki a small fix for your consideration. A review would be highly appreciated.

zurawiki commented 11 months ago

Thank you for this contribution, @ursachec. Once CI passes, I will merge this PR and ship a new release.

ursachec commented 11 months ago

That's great, thanks for the super-fast response @zurawiki!