Closed. zengqingfu1442 closed this issue 6 months ago.
Do you do tokenization and de-tokenization yourself?
Yes. It's a custom backend built on the Triton Python backend.
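For illustration, doing tokenization and de-tokenization "yourself" inside a custom backend amounts to a round trip like the toy sketch below. The vocabulary and whitespace splitting here are purely hypothetical; a real backend would load the model's actual tokenizer (e.g. a Hugging Face tokenizer) in its initialization step.

```python
# Toy sketch of the tokenize/de-tokenize round trip a custom backend
# performs itself when no separate preprocessing model is in the pipeline.
# VOCAB is a hypothetical stand-in for a real tokenizer's vocabulary.
VOCAB = {"<unk>": 0, "hello": 1, "world": 2}
INV_VOCAB = {i: t for t, i in VOCAB.items()}

def tokenize(text: str) -> list[int]:
    # Whitespace split plus vocab lookup; unknown words map to <unk>.
    return [VOCAB.get(tok, VOCAB["<unk>"]) for tok in text.lower().split()]

def detokenize(ids: list[int]) -> str:
    # Inverse lookup, joined with spaces.
    return " ".join(INV_VOCAB.get(i, "<unk>") for i in ids)

ids = tokenize("Hello world")
print(ids)              # [1, 2]
print(detokenize(ids))  # hello world
```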
If you use a BLS model like https://github.com/triton-inference-server/tensorrtllm_backend/blob/main/all_models/inflight_batcher_llm/tensorrt_llm_bls/config.pbtxt, I think it's compatible, since the model inputs match.
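For context, the BLS model's interface centers on string-in/string-out tensors along the lines of the abridged fragment below. This is a sketch from memory, not the full config; verify tensor names and types against the linked config.pbtxt, which is the authoritative source.

```
input [
  {
    name: "text_input"
    data_type: TYPE_STRING
    dims: [ -1 ]
  },
  {
    name: "max_tokens"
    data_type: TYPE_INT32
    dims: [ -1 ]
  }
]
output [
  {
    name: "text_output"
    data_type: TYPE_STRING
    dims: [ -1 ]
  }
]
```

A custom model is compatible in this sense when it accepts and produces tensors with the same names, types, and shapes.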
The inputs and outputs of my custom model are listed above. How can I adjust my model to make it compatible?
Can I use a single model in the Triton model repository?