Closed — XiaobingSuper closed this PR 4 months ago
@schetlur-nv
Hi @XiaobingSuper, thanks a lot for your great contribution! We've integrated your changes in PR https://github.com/triton-inference-server/tensorrtllm_backend/pull/454 and credited you as co-author, so I'm going to close this PR.
Let me know if you have any questions, and thanks again for helping make TensorRT-LLM better.
For HF `Tokenizer.encode`, the `add_special_tokens` default value is `True` (https://huggingface.co/transformers/v2.11.0/main_classes/tokenizer.html#transformers.PreTrainedTokenizer.encode), but the tensorrtllm_backend default value is `False`. If the user doesn't set it in `preprocessing/config.pbtxt`, there may be some issues: https://github.com/triton-inference-server/tensorrtllm_backend/issues/445 and https://github.com/triton-inference-server/tensorrtllm_backend/issues/434. I think we need to align the behavior of tensorrtllm_backend with HF and not expect the user to set it before launching a model server (same for `Tokenizer.decode`).
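To make the mismatch concrete, here is a toy sketch (not the real HF or tensorrtllm_backend code; the token IDs and `BOS_ID` are made up for illustration) of how differing `add_special_tokens` defaults produce different token sequences for the same prompt:

```python
# Toy model of the mismatch: HF's Tokenizer.encode defaults to
# add_special_tokens=True (so special tokens like BOS are prepended),
# while the backend's preprocessing defaults to False.
BOS_ID = 1  # hypothetical beginning-of-sequence token id


def toy_encode(token_ids, add_special_tokens=True):
    """Mimic HF's default: prepend BOS unless the caller opts out."""
    return [BOS_ID] + list(token_ids) if add_special_tokens else list(token_ids)


body = [15043, 3186]  # hypothetical ids for "Hello world"

hf_style = toy_encode(body)                               # HF-style default (True)
backend_style = toy_encode(body, add_special_tokens=False)  # backend's default (False)

print(hf_style)       # [1, 15043, 3186]
print(backend_style)  # [15043, 3186]
```

Since the engine was typically built against HF-tokenized inputs, serving it with the `False` default silently drops the special tokens and can degrade output quality, which is what the linked issues report.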