mrsteele / openai-tokens

A service for calculating, managing, and truncating OpenAI tokens
MIT License
18 stars · 3 forks

fix: add missing encoder for 4o models #40

Closed randompixel closed 3 weeks ago

randompixel commented 1 month ago

When using gpt-4o as the model, the current release fails when trying to get an encoder (all the unit tests use gpt-3.5-turbo as the model):

TypeError: Cannot read properties of undefined (reading 'encode')

If we look at js-tiktoken, gpt-4o uses o200k_base as its encoding, which isn't listed in the model list in encoder.js.

This pull request adds the new o200k_base encoding to the encoder list. I added a simple test to make sure a variety of models return an encoder. Ideally I would have liked to loop over the `models` const from models.js, but it isn't exported and I wasn't confident in changing that to be exported.
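The failure mode described above can be sketched as a prefix lookup from model name to encoding name. This is a minimal illustration of the bug, not the repo's actual encoder.js: the object keys, function name, and structure are assumed. Without an entry for `gpt-4o`, the lookup returns `undefined`, and the subsequent `.encode` call throws the `TypeError` quoted above.

```javascript
// Hypothetical sketch of a model-to-encoding lookup (names assumed).
// js-tiktoken's gpt-4o models use the o200k_base encoding; older chat
// models use cl100k_base. Longer prefixes are listed first so that
// "gpt-4o..." matches before the generic "gpt-4" entry.
const encodings = {
  'gpt-4o': 'o200k_base', // the entry this PR adds
  'gpt-4': 'cl100k_base',
  'gpt-3.5-turbo': 'cl100k_base'
}

function encodingForModel (model) {
  const prefix = Object.keys(encodings).find((p) => model.startsWith(p))
  // Before the fix, gpt-4o models fell through here and the caller
  // received undefined, producing:
  //   TypeError: Cannot read properties of undefined (reading 'encode')
  return prefix ? encodings[prefix] : undefined
}

console.log(encodingForModel('gpt-4o-mini')) // 'o200k_base'
console.log(encodingForModel('gpt-3.5-turbo-0125')) // 'cl100k_base'
```

In the real package the resolved encoding name would then be handed to js-tiktoken's `getEncoding()` to build the tokenizer whose `.encode` was failing.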

I also bumped the dependency to .14, as this is the minimum js-tiktoken release that supports the 4o models. I think that's better than relying on the ^ range alone?
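Raising the floor of the caret range would look like the following package.json fragment (a sketch assuming ".14" refers to js-tiktoken 1.0.14; not the repo's actual file):

```json
{
  "dependencies": {
    "js-tiktoken": "^1.0.14"
  }
}
```

The caret still allows newer 1.0.x releases, but a fresh install can no longer resolve to an older release that lacks the o200k_base encoding.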

randompixel commented 4 weeks ago

Rebased the branch on the latest chore(deps) updates on main.

@mrsteele, would you accept this PR into a bugfix release?

mrsteele commented 3 weeks ago

Looks great! Thanks for the fix!

github-actions[bot] commented 3 weeks ago

:tada: This PR is included in version 2.3.6 :tada:

The release is available on:

Your semantic-release bot :package::rocket: