triton-inference-server / tensorrtllm_backend

The Triton TensorRT-LLM Backend

GptManager's extensibility issues with input & output parameters #437

Open · service-kit opened this issue 2 months ago

service-kit commented 2 months ago

GptManager does not support extending its input & output parameters, which makes it impossible to add new TensorRT-LLM inference parameters through the backend. Will this be supported in the future? Example:

[image attachment]
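For context, here is a minimal sketch of the kind of extension point being requested. All names below (`ExtensibleInferenceRequest`, `extraParams`, `hypothetical_new_param`) are hypothetical illustrations, not the actual GptManager API: the idea is a request type that carries an open-ended, name-keyed parameter map alongside its fixed fields, so a new engine parameter can be attached without changing the manager's interface.

```cpp
// Hypothetical sketch only -- not the real GptManager API.
// Shows a request carrying an open-ended parameter map in addition to
// its fixed, well-known fields, so new TensorRT-LLM inference
// parameters can be forwarded without an interface change.
#include <cstdint>
#include <iostream>
#include <map>
#include <string>
#include <variant>
#include <vector>

using ParamValue = std::variant<int64_t, float, std::string, std::vector<int32_t>>;

struct ExtensibleInferenceRequest
{
    // Fixed fields stay as they are today.
    std::vector<int32_t> inputIds;
    int32_t maxNewTokens = 0;

    // Extension point: parameters the manager does not need to know
    // about in advance; it would simply forward them to the engine.
    std::map<std::string, ParamValue> extraParams;
};

int main()
{
    ExtensibleInferenceRequest req;
    req.inputIds = {1, 2, 3};
    req.maxNewTokens = 64;

    // A new engine parameter is attached without touching the struct.
    req.extraParams["hypothetical_new_param"] = 0.5f;

    std::cout << "extra params: " << req.extraParams.size() << '\n';
    return 0;
}
```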

service-kit commented 2 months ago

Our requirement is to pass new parameters through to the TensorRT-LLM engine. Parameter extension could be handled at the backend level, but it is currently blocked because GptManager is closed source.
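To make the blocking point concrete, a hedged sketch (hypothetical names and simplified types, not the actual tensorrtllm_backend code): the open-source backend layer can already see arbitrary named input tensors and could easily set aside the ones it does not recognize, but there is no GptManager entry point that would accept those extras, so they have nowhere to go.

```cpp
// Hypothetical sketch of the backend-level side -- not the actual
// tensorrtllm_backend code. Tensor values are stand-in strings here;
// the point is only the control flow.
#include <iostream>
#include <map>
#include <set>
#include <string>

// Inputs the (hypothetical) manager interface already understands.
static const std::set<std::string> kKnownInputs = {
    "input_ids", "request_output_len", "temperature"};

// Split request inputs into known fields and would-be extensions.
std::map<std::string, std::string> collectExtraInputs(
    const std::map<std::string, std::string>& requestInputs)
{
    std::map<std::string, std::string> extras;
    for (const auto& [name, value] : requestInputs)
    {
        if (kKnownInputs.count(name) == 0)
        {
            // Backend-level expansion is straightforward: keep the
            // tensor. The blocker is downstream: a closed-source
            // GptManager offers no entry point for these extras.
            extras.emplace(name, value);
        }
    }
    return extras;
}

int main()
{
    const std::map<std::string, std::string> inputs = {
        {"input_ids", "[1,2,3]"},
        {"temperature", "0.7"},
        {"hypothetical_new_param", "0.5"}};

    for (const auto& [name, value] : collectExtraInputs(inputs))
    {
        std::cout << "unforwardable extra input: " << name << '\n';
    }
    return 0;
}
```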