dqbd / tiktoken

JS port and JS/WASM bindings for openai/tiktoken
MIT License

Missing ranks files from js-tiktoken #108

Open jonasb opened 3 weeks ago

jonasb commented 3 weeks ago

I'm using js-tiktoken and haven't been able to upgrade since version 1.0.7. It turns out that there are no ranks files in the latest version of js-tiktoken.

Is the removal of the ranks files intentional? Or is there some configuration error in js-tiktoken?

This is how I've been using js-tiktoken:

import { Tiktoken } from "js-tiktoken/lite";
import cl100k_base from "js-tiktoken/ranks/cl100k_base";

const encoder = new Tiktoken(cl100k_base);

A tree listing from js-tiktoken@1.0.12

tree node_modules/js-tiktoken 
node_modules/js-tiktoken
├── README.md
├── dist
│   ├── chunk-PEBACC3C.js
│   ├── core-262103d7.d.ts
│   ├── index.cjs
│   ├── index.d.ts
│   ├── index.js
│   ├── lite.cjs
│   ├── lite.d.ts
│   └── lite.js
├── index.d.ts
├── index.js
├── lite.d.ts
├── lite.js
└── package.json

2 directories, 14 files

A tree listing from js-tiktoken@1.0.7

node_modules/js-tiktoken
├── README.md
├── dist
│   ├── chunk-BJSHOR2F.js
│   ├── chunk-EFS4X6KN.js
│   ├── chunk-F7G2FLS4.js
│   ├── chunk-H4GMFLYA.js
│   ├── chunk-LWEZBMPN.js
│   ├── chunk-THGZSONF.js
│   ├── chunk-XXPGZHWZ.js
│   ├── core-810722a7.d.ts
│   ├── index.cjs
│   ├── index.d.ts
│   ├── index.js
│   ├── lite.cjs
│   ├── lite.d.ts
│   ├── lite.js
│   └── ranks
│       ├── cl100k_base.cjs
│       ├── cl100k_base.d.ts
│       ├── cl100k_base.js
│       ├── gpt2.cjs
│       ├── gpt2.d.ts
│       ├── gpt2.js
│       ├── p50k_base.cjs
│       ├── p50k_base.d.ts
│       ├── p50k_base.js
│       ├── p50k_edit.cjs
│       ├── p50k_edit.d.ts
│       ├── p50k_edit.js
│       ├── r50k_base.cjs
│       ├── r50k_base.d.ts
│       └── r50k_base.js
├── index.d.ts
├── index.js
├── lite.d.ts
├── lite.js
└── package.json

3 directories, 35 files

darrenangle commented 1 week ago

Same here. This breaks single-encoder imports in js-tiktoken. It looks like the files are missing from the repo altogether somehow...
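
For anyone who wants to check their own install, the difference between the two tree listings above can be detected with a small script. This is only a sketch: `listRankEncodings` is a hypothetical helper, not part of js-tiktoken, and it assumes the `dist/ranks` layout shown in the 1.0.7 listing and a package installed under `node_modules/js-tiktoken`.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper (not part of js-tiktoken): returns the encoding names
// that have a .js rank file under <pkgDir>/dist/ranks, which is the layout
// seen in the 1.0.7 tree listing. Returns [] when the directory is absent,
// as in the 1.0.12 listing.
function listRankEncodings(pkgDir: string): string[] {
  const ranksDir = path.join(pkgDir, "dist", "ranks");
  if (!fs.existsSync(ranksDir)) {
    return [];
  }
  return fs
    .readdirSync(ranksDir)
    // Keep only the ESM builds; this skips the .cjs and .d.ts siblings.
    .filter((file) => file.endsWith(".js"))
    .map((file) => path.basename(file, ".js"))
    .sort();
}

// Assumed install location, relative to the current working directory.
const found = listRankEncodings(path.join("node_modules", "js-tiktoken"));
console.log(
  found.length > 0
    ? `ranks present: ${found.join(", ")}`
    : "no ranks files found"
);
```

If the 1.0.7 listing is accurate, this should report the five encodings shown there (cl100k_base, gpt2, p50k_base, p50k_edit, r50k_base), and nothing for 1.0.12.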