mlc-ai / tokenizers-cpp

Universal cross-platform tokenizers binding to HF and sentencepiece
Apache License 2.0
210 stars, 46 forks

Is there any plan to support tiktoken? #23

Open Jasonsey opened 6 months ago

Jasonsey commented 6 months ago

I am trying to get MLC to support Qwen, but the model's tokenizer uses tiktoken, developed by OpenAI, which this repo does not currently support. Is there any plan to add this feature?

tqchen commented 6 months ago

As of now we don't have any planned effort for this; contributions are welcome.

gaokao123 commented 3 weeks ago

@Jasonsey Is there a solution for Qwen's tiktoken tokenizer now?