alondmnt / joplin-plugin-jarvis

Joplin (note-taking) assistant running a very intelligent system (OpenAI/GPT, Hugging Face, Gemini, Llama, Universal Sentence Encoder, etc.)
GNU Affero General Public License v3.0

better word count with dqbd/tiktoken #10

Closed ahxxm closed 1 year ago

ahxxm commented 1 year ago

It's the tokenizer for OpenAI's models, so it allows a precise split. The pure JS version is not small, though: its unpacked size is 10 MB. https://www.npmjs.com/package/js-tiktoken

alondmnt commented 1 year ago

Thanks for the suggestion! I'll have a look at it. However, since I recently added many models other than OpenAI's, maintaining support for a separate tokenizer per model would be hard. Perhaps I'll add it in the future, but for now I'll keep an approximate token count.
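For readers wondering what a model-agnostic approximate token count might look like, here is a minimal sketch. It is not taken from the Jarvis source: the function names `approximateTokenCount` and `splitByApproximateTokens` are hypothetical, and the ratio of roughly 4 characters per token is only a common rule of thumb for English text, not a property of any particular tokenizer.

```typescript
// Hypothetical helpers for illustration only; not the plugin's actual code.
// Assumption: about 4 characters per token, a rough rule of thumb for English text.
function approximateTokenCount(text: string, charsPerToken: number = 4): number {
  if (charsPerToken <= 0) {
    throw new Error("charsPerToken must be positive");
  }
  const trimmed = text.trim();
  if (trimmed.length === 0) {
    return 0;
  }
  // Round up so short strings never count as zero tokens.
  return Math.ceil(trimmed.length / charsPerToken);
}

// Greedily packs whitespace-separated words into chunks whose estimated
// token count stays within maxTokens. A single word longer than the budget
// is kept whole in its own chunk rather than being cut.
function splitByApproximateTokens(
  text: string,
  maxTokens: number,
  charsPerToken: number = 4
): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  let current = "";
  for (const word of words) {
    const candidate = current.length === 0 ? word : current + " " + word;
    if (current.length > 0 && approximateTokenCount(candidate, charsPerToken) > maxTokens) {
      chunks.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current.length > 0) {
    chunks.push(current);
  }
  return chunks;
}
```

The trade-off matches the one discussed above: an estimate like this needs no per-model vocabulary and adds nothing to the bundle size, but it can overshoot or undershoot the real count, so any limit based on it should leave some headroom.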