WilliamOnVoyage opened 3 months ago
Hi @WilliamOnVoyage, I believe both the vLLM and TensorRT-LLM backends handle tokenization internally without requiring any user code changes, and they are configurable through their respective config files or based on the model being used. Does this satisfy your needs?
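To illustrate the point above, with the vLLM backend a client can send raw text directly and tokenization happens inside the backend. A minimal sketch, assuming a model named `vllm_model` that exposes the backend's default `text_input`/`text_output` tensors and returns a single (non-streaming, non-decoupled) response over HTTP; check your deployed model's config for the actual names and transaction policy:

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to a locally running Triton server (assumed HTTP endpoint on 8000).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Raw text goes in; the backend tokenizes internally.
prompts = np.array(["What is Triton Inference Server?"], dtype=object)
text_input = httpclient.InferInput("text_input", list(prompts.shape), "BYTES")
text_input.set_data_from_numpy(prompts)

result = client.infer(model_name="vllm_model", inputs=[text_input])
print(result.as_numpy("text_output"))
```

A streaming or decoupled deployment would use the gRPC streaming client instead of the plain HTTP `infer` call shown here.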
Hi team,
Currently the working pattern for server-side tokenization is for users to write a `model.py` with the Python backend to perform tokenization, which is great for flexibility and customization. But given the rise of language models and the popularity of some common model/tokenizer architectures, I'm wondering whether you plan to provide tokenizer support natively, so users can configure the tokenizer just through tokenizer artifacts and `config.pbtxt`.
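For reference, the current pattern described above (a Python-backend `model.py` that tokenizes server side) might look roughly like the sketch below. The tensor names (`TEXT`, `INPUT_IDS`, `ATTENTION_MASK`), the tokenizer location under the model directory, and the use of a Hugging Face `AutoTokenizer` are illustrative assumptions rather than Triton defaults:

```python
# model.py — minimal Python-backend tokenization sketch.
import numpy as np
import triton_python_backend_utils as pb_utils
from transformers import AutoTokenizer


class TritonPythonModel:
    def initialize(self, args):
        # Assumed layout: tokenizer artifacts shipped under <model_repository>/1/tokenizer.
        self.tokenizer = AutoTokenizer.from_pretrained(
            args["model_repository"] + "/1/tokenizer"
        )

    def execute(self, requests):
        responses = []
        for request in requests:
            # Incoming raw text arrives as a BYTES tensor; decode to Python strings.
            text = pb_utils.get_input_tensor_by_name(request, "TEXT").as_numpy()
            text = [t.decode("utf-8") for t in text.reshape(-1)]

            enc = self.tokenizer(text, padding=True, return_tensors="np")

            out_ids = pb_utils.Tensor("INPUT_IDS", enc["input_ids"].astype(np.int64))
            out_mask = pb_utils.Tensor("ATTENTION_MASK", enc["attention_mask"].astype(np.int64))
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[out_ids, out_mask])
            )
        return responses
```

The matching `config.pbtxt` would set `backend: "python"` and declare the same input/output tensors, which is exactly the per-model boilerplate that native tokenizer support could remove.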