karpathy / minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
MIT License
9.19k stars 866 forks

Support for the GPT-4o tokenizer? #77

Open echo-valor opened 6 months ago