Closed transitive-bullshit closed 6 months ago
bundle size is obv important, but do we care about bundle size? bundlephobia is showing 238k minified bundle but that seems like it has to be wrong with tiktoken.
also, have you looked in the js tokenizer libs lately? is there a better one yet?
agreed that it's not a priority; just bringing it up because gptlint came in at 25MB and 80% of that was dexter, which was surprising to me.
> also, have you looked in the js tokenizer libs lately? is there a better one yet?
Not that I'm aware of; langchain is still using js-tiktoken and I haven't seen any others gain wide adoption.
ok let's close this then, considering it's mostly tiktoken and we don't have a good alternative.
Currently sitting at ~17MB with 14MB coming from tiktoken: https://pkg-size.dev/@dexaai%2Fdexter
See also https://github.com/dqbd/tiktoken/issues/68
For comparison, here's langchain at ~36MB: https://pkg-size.dev/langchain but we should be a lot slimmer than this. Langchain's not even loading the full tiktoken WASM lib; they're using the 6.6MB js-tiktoken.
This issue may end up just being resolved by improving tiktoken's WASM bundle size upstream, but I wanted to track it while it's top of mind for gptlint.
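For anyone wanting to sanity-check the breakdown locally instead of relying on pkg-size.dev, here's a rough sketch. It assumes a plain node_modules layout on disk; `dirSize` and `sizeShares` are hypothetical helper names for illustration, not part of dexter or tiktoken, and the numbers it reports will differ somewhat from pkg-size.dev since that tool measures a fresh install.

```typescript
import { readdirSync, statSync } from "node:fs";
import { join } from "node:path";

// Recursively sum file sizes (in bytes) under a directory.
// Symlinks are skipped so linked packages are not double counted.
function dirSize(dir: string): number {
  let total = 0;
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const full = join(dir, entry.name);
    if (entry.isDirectory()) total += dirSize(full);
    else if (entry.isFile()) total += statSync(full).size;
  }
  return total;
}

// Given per-package sizes, return each package's share of the total,
// largest first. Units don't matter as long as they are consistent.
function sizeShares(
  sizes: Record<string, number>
): { name: string; size: number; share: number }[] {
  const total = Object.values(sizes).reduce((a, b) => a + b, 0);
  return Object.entries(sizes)
    .map(([name, size]) => ({ name, size, share: total === 0 ? 0 : size / total }))
    .sort((a, b) => b.size - a.size);
}

// Using the rough figures from this issue (MB): 14 of ~17 is tiktoken.
const shares = sizeShares({ tiktoken: 14, "everything else": 3 });
console.log(shares.map((s) => `${s.name}: ${(s.share * 100).toFixed(0)}%`));
```

To run it against a real install, something like `dirSize("node_modules/tiktoken")` per dependency would feed `sizeShares`; the exact paths depend on the package manager in use.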