openai / tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.
MIT License

Added encodings for two new embedding models #247

Closed: Praneet460 closed this 8 months ago

Praneet460 commented 8 months ago

Problem: The library has no encoding mapping for the two new embedding models.

tiktoken.encoding_for_model("text-embedding-3-small") raises a KeyError


Solution: Added encoding mappings for the two new embedding models. The mapping is taken from here
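For readers stuck on a release without the mapping, the KeyError can be worked around by falling back to an encoding name by hand. The sketch below is illustrative only: `resolve_encoding` and `FALLBACK_ENCODINGS` are hypothetical names, and the choice of `cl100k_base` (and the second model name, `text-embedding-3-large`) is an assumption that this thread does not state explicitly. The lookup functions are passed in so the helper itself has no hard dependency on tiktoken.

```python
# Hypothetical fallback table. The encoding name "cl100k_base" and the model
# "text-embedding-3-large" are assumptions; this thread does not spell them out.
FALLBACK_ENCODINGS = {
    "text-embedding-3-small": "cl100k_base",
    "text-embedding-3-large": "cl100k_base",
}


def resolve_encoding(model, lookup_by_model, lookup_by_name):
    """Return an encoding for `model`, falling back to a manual mapping.

    lookup_by_model: callable that raises KeyError for unknown models,
                     e.g. tiktoken.encoding_for_model
    lookup_by_name:  callable that takes an encoding name,
                     e.g. tiktoken.get_encoding
    """
    try:
        return lookup_by_model(model)
    except KeyError:
        name = FALLBACK_ENCODINGS.get(model)
        if name is None:
            # Not a model we know how to map; re-raise the original error.
            raise
        return lookup_by_name(name)
```

With tiktoken installed, this could be called as `resolve_encoding("text-embedding-3-small", tiktoken.encoding_for_model, tiktoken.get_encoding)`.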

hoonlight commented 8 months ago

@hauntsaninja Hi, can you check this PR?

stevieflyer commented 8 months ago

Really looking forward to this merge

chrispy-snps commented 8 months ago

Thanks for adding these!

usamasaleem1 commented 8 months ago

Can we merge this to main so I can start using the new models!

Vaibhav2001 commented 8 months ago

Can we merge this to main so I can start using the new models!

+1

itarutomy97 commented 8 months ago

+1

jnance314 commented 8 months ago

+1

emsi commented 8 months ago

In the meantime, you can install the PR branch directly: pip install -U git+https://github.com/Praneet460/tiktoken@Add-New-Embedding-Models

will-mako-ai commented 8 months ago

+1

ByeongUkChoi commented 8 months ago

+1

hauntsaninja commented 7 months ago

This has been released in tiktoken 0.6
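To check whether an environment already has a release with this change, the installed version can be compared against 0.6. This is a rough sketch, not part of tiktoken: `version_at_least` is a hypothetical helper, and it only understands plain numeric versions such as "0.6.0".

```python
from importlib.metadata import PackageNotFoundError, version


def version_at_least(installed, minimum=(0, 6)):
    """Compare the major.minor part of a version string against `minimum`.

    Hypothetical helper: handles plain numeric versions like "0.6.0" only;
    pre-release suffixes in the first two components are not supported.
    """
    major_minor = tuple(int(part) for part in installed.split(".")[:2])
    return major_minor >= minimum


if __name__ == "__main__":
    try:
        installed = version("tiktoken")
    except PackageNotFoundError:
        print("tiktoken is not installed")
    else:
        print(installed, "includes the new mappings:", version_at_least(installed))
```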