BlinkDL / RWKV-LM

RWKV is an RNN with transformer-level LLM performance that can be trained directly like a GPT (parallelizable). It combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embedding.
Apache License 2.0

Why isn't vocab.json provided? #87

Closed · hofe7 closed this issue 1 year ago

hofe7 commented 1 year ago

Shouldn't the .pth file and vocab.json be created from different datasets? Please correct me if I am wrong.

BlinkDL commented 1 year ago

All Pile & Raven models use the 20b_tokenizer from GPT-NeoX 20b, so a separate vocab.json is not shipped per model.
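
In practice this means the tokenizer can be obtained from EleutherAI's GPT-NeoX 20B release rather than from a per-model vocab.json. Below is a minimal sketch (not from the repo itself) using the `transformers` library; the model ID `EleutherAI/gpt-neox-20b` refers to the public Hugging Face release, and the local path `20B_tokenizer.json` is an assumption about where the tokenizer file sits if you use the repo's copy instead:

```python
# Minimal sketch: load the GPT-NeoX 20B tokenizer that the
# Pile & Raven checkpoints were trained with.
from transformers import AutoTokenizer

# Fetch the tokenizer from EleutherAI's public GPT-NeoX 20B release.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

ids = tokenizer.encode("Hello RWKV")   # token IDs in the vocab the model expects
text = tokenizer.decode(ids)           # round-trips back to the original string
print(ids, text)

# Alternatively, if a local copy of the tokenizer JSON is available
# (path is an assumption), the standalone `tokenizers` library works too:
#
# from tokenizers import Tokenizer
# tok = Tokenizer.from_file("20B_tokenizer.json")
# print(tok.encode("Hello RWKV").ids)
```

Either route yields the same vocabulary, which is why the checkpoints do not need their own vocab.json: the tokenizer is fixed across the Pile & Raven model family, independent of each model's training data mix.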