triton-inference-server / tensorrtllm_backend

The Triton TensorRT-LLM Backend

GptManager's extensibility issues with input & output parameters #437

Open · service-kit opened this issue 2 months ago

service-kit commented 2 months ago

GptManager does not support extending its input & output parameters, which makes it impossible to add new TensorRT-LLM inference parameters through the backend. Will this be supported in the future? Example:

[image attachment]
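For context, here is a minimal sketch of the kind of extension point being requested. All names below (`ExtensibleInferenceRequest`, `extraParams`, `hypothetical_new_param`) are hypothetical illustrations, not the actual GptManager API: the idea is a request type that carries an open-ended, name-keyed parameter map alongside its fixed fields, so a new engine parameter can be attached without changing the manager's interface.

```cpp
// Hypothetical sketch only -- not the real GptManager API.
// Shows a request carrying an open-ended parameter map in addition to
// its fixed, well-known fields, so new TensorRT-LLM inference
// parameters can be forwarded without an interface change.
#include <cstdint>
#include <iostream>
#include <map>
#include <string>
#include <variant>
#include <vector>

using ParamValue = std::variant<int64_t, float, std::string, std::vector<int32_t>>;

struct ExtensibleInferenceRequest
{
    // Fixed fields stay as they are today.
    std::vector<int32_t> inputIds;
    int32_t maxNewTokens = 0;

    // Extension point: parameters the manager does not need to know
    // about in advance; it would simply forward them to the engine.
    std::map<std::string, ParamValue> extraParams;
};

int main()
{
    ExtensibleInferenceRequest req;
    req.inputIds = {1, 2, 3};
    req.maxNewTokens = 64;

    // A new engine parameter is attached without touching the struct.
    req.extraParams["hypothetical_new_param"] = 0.5f;

    std::cout << "extra params: " << req.extraParams.size() << '\n';
    return 0;
}
```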

service-kit commented 2 months ago

Our requirement is to pass new parameters through to the TensorRT-LLM engine. Parameter extension could be handled at the backend level, but it is currently blocked because GptManager is closed source.
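To make the blocking point concrete, a hedged sketch (hypothetical names and simplified types, not the actual tensorrtllm_backend code): the open-source backend layer can already see arbitrary named input tensors and could easily set aside the ones it does not recognize, but there is no GptManager entry point that would accept those extras, so they have nowhere to go.

```cpp
// Hypothetical sketch of the backend-level side -- not the actual
// tensorrtllm_backend code. Tensor values are stand-in strings here;
// the point is only the control flow.
#include <iostream>
#include <map>
#include <set>
#include <string>

// Inputs the (hypothetical) manager interface already understands.
static const std::set<std::string> kKnownInputs = {
    "input_ids", "request_output_len", "temperature"};

// Split request inputs into known fields and would-be extensions.
std::map<std::string, std::string> collectExtraInputs(
    const std::map<std::string, std::string>& requestInputs)
{
    std::map<std::string, std::string> extras;
    for (const auto& [name, value] : requestInputs)
    {
        if (kKnownInputs.count(name) == 0)
        {
            // Backend-level expansion is straightforward: keep the
            // tensor. The blocker is downstream: a closed-source
            // GptManager offers no entry point for these extras.
            extras.emplace(name, value);
        }
    }
    return extras;
}

int main()
{
    const std::map<std::string, std::string> inputs = {
        {"input_ids", "[1,2,3]"},
        {"temperature", "0.7"},
        {"hypothetical_new_param", "0.5"}};

    for (const auto& [name, value] : collectExtraInputs(inputs))
    {
        std::cout << "unforwardable extra input: " << name << '\n';
    }
    return 0;
}
```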