triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Version specific config.pbtxt #7406

Status: Open · lminer opened 4 months ago

lminer commented 4 months ago

We would like to be able to deploy multiple versions of the same model. Unfortunately, they will not necessarily always have the same shapes and dtypes.

It would be great to have a per version config.pbtxt (maybe nested in the version directory itself).

Another option would be to simply use different model names for these versions, but that doesn't seem as clean, since the versions are so tightly related, and it would lead to a proliferation of names.
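To make the request concrete, here is a sketch of the layout being proposed (hypothetical — Triton does not currently read a per-version `config.pbtxt`; today only the top-level one is honored, and the model name `mymodel` is illustrative):

```text
model_repository/
  mymodel/
    config.pbtxt          # existing: single config shared by all versions
    1/
      config.pbtxt        # proposed: shapes/dtypes specific to version 1
      model.onnx
    2/
      config.pbtxt        # proposed: shapes/dtypes specific to version 2
      model.onnx
```

Under this proposal a version directory's config would override (or extend) the top-level one, so versions with different shapes and dtypes could coexist under one model name.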

sourabh-burnwal commented 4 months ago

@lminer there are two ways you can solve your issue:

  1. Define a single input/output schema (shapes and dtypes) that every version of the model can conform to.
  2. Use optional inputs. See: https://github.com/triton-inference-server/server/issues/3419
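For reference, a minimal `config.pbtxt` sketch using the second suggestion. The model name, tensor names, and shapes are illustrative, and `optional: true` on an input requires a reasonably recent Triton release:

```text
name: "mymodel"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  {
    name: "INPUT0"
    data_type: TYPE_FP32
    dims: [ 16 ]
  },
  {
    # Input only present in newer model versions; clients for
    # older versions simply omit it from the inference request.
    name: "INPUT1"
    data_type: TYPE_FP32
    dims: [ 32 ]
    optional: true
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_FP32
    dims: [ 16 ]
  }
]
```

With this, one config can cover versions whose inputs differ, as long as the union of all inputs is declared and the version-specific ones are marked optional.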
lminer commented 4 months ago

Thanks for the suggestions. I did not know about optional inputs! Unfortunately, both of those solutions add a lot of complexity that is difficult to manage and would be bug-prone, at least in our context, so we'll probably just opt to add more model names.

I just want to reiterate that I think this feature request would be a nice long-term solution. Input and output shapes inevitably change as models evolve, and a per-version config would allow easier swapping between versions without having to define dozens of version-specific inputs and outputs in the config.pbtxt.
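For anyone landing here, the separate-model-names workaround looks like the following (names are hypothetical). Each version becomes its own model directory with its own `config.pbtxt`, at the cost of losing Triton's built-in version policy across them:

```text
model_repository/
  mymodel_v1/
    config.pbtxt          # shapes/dtypes for v1
    1/
      model.onnx
  mymodel_v2/
    config.pbtxt          # shapes/dtypes for v2
    1/
      model.onnx
```

Clients then select a "version" by requesting `mymodel_v1` or `mymodel_v2` by name rather than by the version field of a single model.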