mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[3rdparty] Bump tokenizer-cpp to enable SentencePiece by default #3025

Closed MasterJH5574 closed 1 week ago

MasterJH5574 commented 1 week ago

This PR bumps 3rdparty/tokenizer-cpp.

The SentencePiece tokenizer was previously disabled by default to reduce the binary size of the built library, but this caused errors when users expected to use it.

This PR enables it by default. Anyone who needs to reduce the binary size of the built library will now have to disable it manually.
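
For readers unfamiliar with this kind of build-time switch, here is a minimal sketch of how a default-on compile-time flag can gate a tokenizer backend. It is not the tokenizer-cpp implementation: the macro `EXAMPLE_ENABLE_SENTENCEPIECE` and the function `SelectTokenizer` are hypothetical names made up for illustration, and the actual option name is not stated in this description.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical macro, for illustration only. It defaults to enabled;
// a size-sensitive build would pass -DEXAMPLE_ENABLE_SENTENCEPIECE=0.
#ifndef EXAMPLE_ENABLE_SENTENCEPIECE
#define EXAMPLE_ENABLE_SENTENCEPIECE 1
#endif

// Returns the name of the backend selected for `kind`.
// Throws std::runtime_error if the backend was compiled out, and
// std::invalid_argument if `kind` is not recognized.
inline std::string SelectTokenizer(const std::string& kind) {
  if (kind == "sentencepiece") {
#if EXAMPLE_ENABLE_SENTENCEPIECE
    return "sentencepiece";
#else
    throw std::runtime_error("SentencePiece support was not compiled in");
#endif
  }
  throw std::invalid_argument("unknown tokenizer kind: " + kind);
}

// Usage: with the default (enabled) setting, the request succeeds.
inline void ExampleUsage() {
  assert(SelectTokenizer("sentencepiece") == "sentencepiece");
}
```

With the flag set to 0, the same request fails at runtime, which is the kind of error users could hit when a backend is off by default.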