Open cahya-wirawan opened 1 month ago
This tokenizer is about 2.5x slower than the other Hugging Face tokenizers and the original BlinkDL world tokenizer. The comparison can be reproduced here: https://colab.research.google.com/gist/cahya-wirawan/932f95ece55c838e186dc3b1c9fcbef4/rwkv-tokenizers.ipynb
It also generates different token IDs for the following edge cases:
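For anyone who wants to check the timing without opening the notebook, here is a rough sketch of how such a comparison could be set up. It is not the code from the linked notebook: `time_encoders`, `relative_slowdown`, and the two stand-in encoders are hypothetical names made up for illustration, and you would replace the stand-ins with the real tokenizers' encode functions.

```python
import time
from typing import Callable, Dict, List, Sequence

Encoder = Callable[[str], List[int]]


def time_encoders(
    encoders: Dict[str, Encoder],
    texts: Sequence[str],
    repeats: int = 5,
) -> Dict[str, float]:
    """Return the best-of-`repeats` wall-clock seconds each encoder needs for all texts."""
    results: Dict[str, float] = {}
    for name, encode in encoders.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            for text in texts:
                encode(text)
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


def relative_slowdown(timings: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Express every timing as a multiple of the baseline encoder's timing."""
    base = timings[baseline]
    return {name: seconds / base for name, seconds in timings.items()}


if __name__ == "__main__":
    # Stand-in encoders (assumptions, NOT real tokenizers): swap in the
    # encode functions of the tokenizers you actually want to compare.
    encoders = {
        "baseline": lambda s: [ord(c) for c in s],
        "candidate": lambda s: [ord(c) for c in s[::-1]][::-1],
    }
    sample_texts = ["The quick brown fox jumps over the lazy dog. " * 50] * 200
    timings = time_encoders(encoders, sample_texts)
    for name, factor in relative_slowdown(timings, "baseline").items():
        print(f"{name}: {timings[name]:.4f}s ({factor:.2f}x baseline)")
```

Taking the best of several repeats reduces noise from other processes, which matters on a shared Colab runtime.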