Closed: cloudyuyuyu closed this issue 5 months ago
Please refer to tokenization_note.md in this repo. Essentially, they are different algorithms, and a byte-level BPE vocabulary simply cannot be converted to a sentencepiece vocabulary because of inherent differences between the two.
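As a rough illustration of one such difference (this sketch is not taken from tokenization_note.md, and the split shown is a hypothetical one chosen for the example): a byte-level BPE token is an arbitrary byte sequence, so a single token may hold only part of a multi-byte UTF-8 character. Such a token is not a valid string on its own, which makes it awkward to express as a text piece in a text-based vocabulary.

```python
# Illustration only: byte-level tokens need not be valid UTF-8 strings.
text = "端"
raw = text.encode("utf-8")  # this character takes 3 bytes in UTF-8

# Hypothetical split that a byte-level merge table could produce
# (assumption for illustration, not an actual entry from any vocabulary).
pieces = [raw[:2], raw[2:]]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes cleanly as UTF-8 on its own."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Neither fragment is a complete character by itself.
piece_validity = [is_valid_utf8(p) for p in pieces]

# Concatenating the raw bytes recovers the original text.
roundtrip = b"".join(pieces).decode("utf-8")

print(pieces, piece_validity, roundtrip)
```

The fragments only make sense once their bytes are concatenated, so decoding has to happen at the byte level rather than piece by piece.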
Start Date
No response
Implementation PR
No response
Reference Issues
No response
Summary
On-device inference engines are only compatible with tokenizer.model and cannot support the tiktoken format.
Basic Example
On-device inference engines are only compatible with tokenizer.model.
Drawbacks
On-device inference engines are only compatible with tokenizer.model.
Unresolved questions
No response