Closed by gautierdag 10 months ago
Thanks for the bug report.
It was caused by the maximum token ID bound, 65536, overflowing to 0 in a uint16 (which can hold at most 65535) when checking for valid token IDs, so every token ID was treated as invalid.
I've fixed this and pushed the change. It requires updating to the latest version (1.1.11):
pip install --upgrade --no-cache-dir tokenmonster
Please let me know if you encounter any issues.
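For anyone curious how this kind of overflow plays out, here is a minimal Python sketch. It is not the actual tokenmonster code: `wrap_uint16`, `is_valid_buggy`, and `is_valid_fixed` are hypothetical names made up for illustration, and the shape of the validity check is an assumption based only on the description above.

```python
import ctypes

# Vocabulary size of the affected models, per the explanation above.
VOCAB_SIZE = 65536


def wrap_uint16(value: int) -> int:
    """Return `value` as it would be stored in an unsigned 16-bit integer."""
    # ctypes integer types do no overflow checking, so the value wraps modulo 2**16.
    return ctypes.c_uint16(value).value


def is_valid_buggy(token_id: int) -> bool:
    """Hypothetical check where the upper bound is held in a uint16."""
    limit = wrap_uint16(VOCAB_SIZE)  # 65536 wraps around to 0
    return token_id < limit          # no non-negative ID is below 0


def is_valid_fixed(token_id: int) -> bool:
    """Same check with the bound kept in a type wide enough to hold 65536."""
    return 0 <= token_id < VOCAB_SIZE


if __name__ == "__main__":
    print(wrap_uint16(VOCAB_SIZE))
    print(is_valid_buggy(0), is_valid_buggy(65535))
    print(is_valid_fixed(0), is_valid_fixed(65535))
```

With the bound wrapped to 0, even token ID 0 fails the check, which matches the symptom of every token being rejected. A vocabulary of 32000 tokens fits in a uint16 without wrapping, so a check like this would not affect it.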
Hi, I was just trying out the code tokenizers, and it seems that all the
code-65536-*
models are unable to decode. The 100k and 32k models work.