knuddelsgmbh / jtokkit

JTokkit is a Java tokenizer library designed for use with OpenAI models.
https://jtokkit.knuddels.de/
MIT License
518 stars 38 forks

Add convenience method to count tokens from a list of `com.theokanning.openai.completion.chat.ChatMessage` #46

Closed. sualeh closed this 6 months ago

sualeh commented 11 months ago

The recipe for counting tokens in chatml.md does not seem to account for functions. This issue requests a utility or convenience method that counts tokens from a list of `com.theokanning.openai.completion.chat.ChatMessage` messages and takes functions and other message types into account.

tox-p commented 10 months ago

A utility method is out of scope for this library, whose only focus is the tokenization algorithm, but I can certainly update the docs for the recipe as soon as I find time.

For now, I can refer you to the following answer, which helped at least one person with a similar question: https://github.com/knuddelsgmbh/jtokkit/issues/30#issuecomment-1594190763
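
For readers who want a starting point in the meantime, here is a minimal sketch of what such a caller-side helper might look like. It is not part of jtokkit and not the maintainers' recipe. The `Message` class is a hypothetical stand-in for `ChatMessage`, the per-message overheads (3 per message, 1 per name, 3 for reply priming) follow the commonly cited OpenAI cookbook values for gpt-3.5-turbo-0613 and gpt-4 and may differ for other models, and counting a function call as the plain tokens of its name plus its arguments is an unverified approximation. Function definitions sent along with the request are not counted at all. The token counter is passed in as a function so the sketch runs without dependencies.

```java
import java.util.List;
import java.util.function.ToIntFunction;

final class ChatTokenCounter {

    /** Hypothetical stand-in for a chat message; NOT the theokanning ChatMessage class. */
    static final class Message {
        final String role;
        final String content;           // may be null for a pure function call
        final String name;              // optional participant name, may be null
        final String functionName;      // optional function call name, may be null
        final String functionArguments; // optional JSON arguments string, may be null

        Message(String role, String content, String name,
                String functionName, String functionArguments) {
            this.role = role;
            this.content = content;
            this.name = name;
            this.functionName = functionName;
            this.functionArguments = functionArguments;
        }
    }

    // Assumed overheads (cookbook values for gpt-3.5-turbo-0613 / gpt-4).
    static final int TOKENS_PER_MESSAGE = 3;
    static final int TOKENS_PER_NAME = 1;
    static final int REPLY_PRIMING = 3;

    /** Crude whitespace word counter, only so the sketch runs without a tokenizer. */
    static final ToIntFunction<String> WORD_COUNTER =
            s -> s.trim().isEmpty() ? 0 : s.trim().split("\\s+").length;

    /** Estimates the prompt size of a conversation using the supplied token counter. */
    static int count(List<Message> messages, ToIntFunction<String> tokenCounter) {
        int total = REPLY_PRIMING;
        for (Message m : messages) {
            total += TOKENS_PER_MESSAGE;
            total += tokens(m.role, tokenCounter);
            total += tokens(m.content, tokenCounter);
            if (m.name != null) {
                total += TOKENS_PER_NAME + tokenCounter.applyAsInt(m.name);
            }
            // Assumption: a function call costs about the tokens of its name and arguments.
            total += tokens(m.functionName, tokenCounter);
            total += tokens(m.functionArguments, tokenCounter);
        }
        return total;
    }

    // Null-safe wrapper: absent fields contribute zero tokens.
    private static int tokens(String text, ToIntFunction<String> tokenCounter) {
        return text == null ? 0 : tokenCounter.applyAsInt(text);
    }
}
```

With jtokkit on the classpath, the second argument could be `encoding::countTokens`, where `encoding` is obtained from `Encodings.newDefaultEncodingRegistry().getEncodingForModel(ModelType.GPT_3_5_TURBO)`. Any estimate produced this way should be checked against the usage numbers the API actually reports.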

lukaszkorecki commented 10 months ago

Hi! I wrote a Clojure wrapper for jtokkit that does this. You could potentially use it from any JVM language, although that requires working with Clojure's Java API.