IAPark / tiktoken_ruby

Unofficial ruby binding for tiktoken by way of rust
MIT License

[Proposal] Add model token limits to tiktoken_ruby #6

Closed mattlindsey closed 4 months ago

mattlindsey commented 1 year ago

Would you be opposed to adding token limits to https://github.com/IAPark/tiktoken_ruby/blob/main/lib/tiktoken_ruby.rb and a get_token_limit method?
Seems better here than in the project I'm helping with (https://github.com/andreibondarev/langchainrb).

    TOKEN_LIMITS = {
      # Source:
      # https://platform.openai.com/docs/api-reference/embeddings
      # https://platform.openai.com/docs/models/gpt-4
      "text-embedding-ada-002" => 8191,
      "gpt-3.5-turbo" => 4096,
      "gpt-3.5-turbo-0301" => 4096,
      "text-davinci-003" => 4097,
      "text-davinci-002" => 4097,
      "code-davinci-002" => 8001,
      "gpt-4" => 8192,
      "gpt-4-0314" => 8192,
      "gpt-4-32k" => 32768,
      "gpt-4-32k-0314" => 32768,
      "text-curie-001" => 2049,
      "text-babbage-001" => 2049,
      "text-ada-001" => 2049,
      "davinci" => 2049,
      "curie" => 2049,
      "babbage" => 2049,
      "ada" => 2049
    }
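
For illustration, the proposed get_token_limit method could look roughly like the sketch below. This is only an assumption about its shape: the module name TokenLimitSketch, the choice to raise ArgumentError for unknown models, and the trimmed-down table are all hypothetical and are not part of tiktoken_ruby.

```ruby
# Hypothetical sketch; none of these names exist in tiktoken_ruby.
module TokenLimitSketch
  # A subset of the limits listed in the proposal above.
  TOKEN_LIMITS = {
    "text-embedding-ada-002" => 8191,
    "gpt-3.5-turbo" => 4096,
    "gpt-4" => 8192,
    "gpt-4-32k" => 32768
  }.freeze

  # Returns the token limit for the given model name.
  # Raising on unknown models is an assumed design choice;
  # returning nil would be an equally reasonable alternative.
  def self.get_token_limit(model_name)
    TOKEN_LIMITS.fetch(model_name) do
      raise ArgumentError, "Unknown model: #{model_name}"
    end
  end
end
```

A caller would then write something like TokenLimitSketch.get_token_limit("gpt-4") and compare the result against the encoded token count of a prompt.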
IAPark commented 9 months ago

Ah, do you know if that's part of the Python tiktoken library?