openai / tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.
MIT License

added two new embedding model's encoding #247

Closed. Praneet460 closed this 9 months ago

Praneet460 commented 10 months ago

Problem: The library doesn't support encoding mappings for the two new embedding models.

tiktoken.encoding_for_model("text-embedding-3-small") raises a KeyError


Solution: Added encoding mappings for the two new embedding models. The mapping is taken from here
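
For anyone on a release that predates this change, the following is a minimal workaround sketch, not the code from this PR. It assumes both new embedding models use the `cl100k_base` encoding, which is the mapping this PR adds. The names `FALLBACK_ENCODINGS`, `fallback_encoding_name`, and `encoding_for_model_with_fallback` are hypothetical helpers introduced for illustration only.

```python
# Hypothetical fallback table (an assumption for illustration): the two new
# embedding models mapped to the cl100k_base encoding.
FALLBACK_ENCODINGS = {
    "text-embedding-3-small": "cl100k_base",
    "text-embedding-3-large": "cl100k_base",
}


def fallback_encoding_name(model_name: str) -> str:
    """Return the fallback encoding name, or raise KeyError if none is known."""
    try:
        return FALLBACK_ENCODINGS[model_name]
    except KeyError:
        raise KeyError(f"no fallback encoding known for {model_name!r}") from None


def encoding_for_model_with_fallback(model_name: str):
    """Try tiktoken's own model lookup first; use the fallback table on KeyError."""
    import tiktoken  # imported lazily so the table above works without tiktoken

    try:
        return tiktoken.encoding_for_model(model_name)
    except KeyError:
        return tiktoken.get_encoding(fallback_encoding_name(model_name))
```

Calling `encoding_for_model_with_fallback("text-embedding-3-small")` should then return an encoding object even on an older tiktoken, provided the encoding data can be loaded.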

hoonlight commented 10 months ago

@hauntsaninja Hi, can you check this PR?

stevieflyer commented 10 months ago

Really looking forward to this merge

chrispy-snps commented 9 months ago

Thanks for adding these!

usamasaleem1 commented 9 months ago

Can we merge this to main so I can start using the new models?

Vaibhav2001 commented 9 months ago

Can we merge this to main so I can start using the new models?

+1

itarutomy97 commented 9 months ago

+1

jnance314 commented 9 months ago

+1

emsi commented 9 months ago

In the meantime, you can just run: pip install -U git+https://github.com/Praneet460/tiktoken@Add-New-Embedding-Models

will-mako-ai commented 9 months ago

+1

ByeongUkChoi commented 9 months ago

+1

hauntsaninja commented 9 months ago

This has been released in tiktoken 0.6
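
To check whether an installed tiktoken already includes this change, a small sketch like the one below may help. It assumes a plain numeric version string such as `0.6.0`. The helper names `has_new_embedding_mappings` and `installed_tiktoken_is_new_enough` are hypothetical and not part of tiktoken.

```python
from importlib.metadata import version


def has_new_embedding_mappings(version_string: str) -> bool:
    """Return True if the (major, minor) version is at least 0.6."""
    # Assumes a plain numeric version such as "0.6.0"; pre-release suffixes
    # like "0.6rc1" are not handled in this sketch.
    major, minor = (int(part) for part in version_string.split(".")[:2])
    return (major, minor) >= (0, 6)


def installed_tiktoken_is_new_enough() -> bool:
    """Check the installed tiktoken distribution (raises if it is not installed)."""
    return has_new_embedding_mappings(version("tiktoken"))
```

If the check returns False, upgrading with `pip install -U tiktoken` should pick up the release mentioned above.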