Ai00-X / ai00_server

A localized open-source AI server that is better than ChatGPT.
https://ai00-x.github.io/ai00_server/
MIT License

Feature request: Huggingface tokenizer support #94

Open melang982 opened 6 months ago

melang982 commented 6 months ago

As far as I know, the training code for the World tokenizer is not available, so those of us who need a custom tokenizer train a Hugging Face (HF) tokenizer instead (the rwkv pip package, the RWKV-LM trainer, and json2binidx_tool all support it). Currently, loading an HF tokenizer in ai00_server fails with:

[ai00_server::middleware] reload model failed: failed to parse vocabulary: invalid value: expected key to be a number in quotes at line 2 column 3
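
To illustrate the mismatch: judging only from the error message, the loader appears to expect a JSON object whose keys are token ids written as quoted numbers, whereas an HF tokenizer.json starts with named sections such as "version" and keeps its vocabulary under "model". The Python sketch below tells the two layouts apart. The function name classify_vocab and both sample documents are hypothetical and made up for illustration; this is not how ai00_server or web-rwkv actually parse the file.

```python
import json


def classify_vocab(text: str) -> str:
    """Guess which vocabulary layout a JSON document uses (illustrative only)."""
    data = json.loads(text)
    if not isinstance(data, dict):
        return "unknown"
    # Layout implied by the error message (an assumption): every key is a
    # token id written as a number in quotes, e.g. {"1": "a", "2": "b"}.
    if data and all(key.isascii() and key.isdigit() for key in data):
        return "id-keyed"
    # HF tokenizer.json: named top-level sections, with the vocabulary
    # stored under the "model" section.
    model = data.get("model")
    if isinstance(model, dict) and "vocab" in model:
        return "huggingface"
    return "unknown"


# Tiny made-up samples of each layout.
id_keyed = '{"1": "a", "2": "b"}'
hf_style = '{"version": "1.0", "model": {"type": "BPE", "vocab": {"a": 0, "b": 1}}}'

print(classify_vocab(id_keyed))
print(classify_vocab(hf_style))
```

A check like this could run before loading, so that an HF file is reported as an unsupported format instead of surfacing as a low-level parse error.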

melang982 commented 6 months ago

I implemented this in a pull request: https://github.com/cryscan/web-rwkv/pull/23