triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

allow model parameters to be specified in ensemble config #5982

Open charlesmelby opened 1 year ago

charlesmelby commented 1 year ago

Is your feature request related to a problem? Please describe.

I have Python components that I would like to use in multiple ensembles (both within one container and across different projects), each of which requires slightly different configuration. This would be easy to handle if there were a mechanism to pass data in directly, or if the Python model knew which ensemble was calling it. Currently it requires custom setup inside the Python model's config.pbtxt, which is more error-prone and works against code sharing.

Describe the solution you'd like

I'd like to be able to specify the value of a model parameter inside the corresponding Step field of the ensemble model:

step [
    {
      model_name: "preprocess_model"
      model_version: -1
      input_map {
        key: "RAW_INPUT"
        value: "INPUT"
      }
      output_map {
        key: "PREPROCESSED_OUTPUT"
        value: "OUTPUT"
      }
      parameters {
        key: "MY_PARAMETER"
        value: {string_value: "MY_VALUE"}
      }
    },
...

This solution is helpful because it can pass configuration information either explicitly, as parameter values, or implicitly, by passing in the calling ensemble's name as a parameter.
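For context, here is a minimal sketch of how a Python backend model reads a parameter from its own config.pbtxt today, using the JSON model config passed to `initialize`. It assumes (and this is only an assumption, since the feature does not exist) that a per-step parameter would show up in the same `parameters` structure. The helper name `read_string_parameter` is hypothetical, and the config at the bottom is hand-written so the snippet runs without a Triton server.

```python
import json


def read_string_parameter(model_config_json, key, default=None):
    """Return the string_value of a model parameter, or `default` if absent.

    Hypothetical helper: it only parses the JSON model config string.
    """
    model_config = json.loads(model_config_json)
    entry = model_config.get("parameters", {}).get(key)
    if entry is None:
        return default
    return entry.get("string_value", default)


class TritonPythonModel:
    def initialize(self, args):
        # args["model_config"] is the model configuration as a JSON string.
        # Assumption: a per-step value would be merged into "parameters".
        self.my_parameter = read_string_parameter(
            args["model_config"], "MY_PARAMETER", default=""
        )


# Stand-alone check with a hand-written config (no Triton server needed).
example_config = json.dumps(
    {
        "name": "preprocess_model",
        "parameters": {"MY_PARAMETER": {"string_value": "MY_VALUE"}},
    }
)
model = TritonPythonModel()
model.initialize({"model_config": example_config})
print(model.my_parameter)
```

With the requested feature, the shared model code could stay identical across ensembles and only the value in each ensemble's step would differ.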

Describe alternatives you've considered

Another possibility is being able to access the calling ensemble from the request, but this option would be less flexible and would require a lot more boilerplate code for simple use cases.
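To illustrate the boilerplate this alternative implies, here is a rough sketch of what the shared Python model would have to carry if it only knew the caller's name. Everything here is hypothetical: the ensemble names, the `PER_ENSEMBLE_CONFIG` table, and `config_for_caller` are made up for illustration, and how the ensemble name would actually be obtained from the request is left open because no such mechanism is described in this issue.

```python
# Hypothetical per-ensemble settings that the shared model would have to
# maintain itself, one entry for every ensemble that might call it.
PER_ENSEMBLE_CONFIG = {
    "ensemble_a": {"MY_PARAMETER": "MY_VALUE"},
    "ensemble_b": {"MY_PARAMETER": "OTHER_VALUE"},
}

# Fallback used when the calling ensemble is not listed above.
DEFAULT_CONFIG = {"MY_PARAMETER": ""}


def config_for_caller(ensemble_name):
    """Return the settings for a calling ensemble (hypothetical lookup)."""
    overrides = PER_ENSEMBLE_CONFIG.get(ensemble_name, {})
    # Later keys win, so per-ensemble overrides replace the defaults.
    return {**DEFAULT_CONFIG, **overrides}


print(config_for_caller("ensemble_a")["MY_PARAMETER"])
print(config_for_caller("unknown_ensemble")["MY_PARAMETER"] == "")
```

The table has to be updated in the shared component every time a new ensemble uses it, which is the coupling that per-step parameters would avoid.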

kthui commented 1 year ago

Thanks for the enhancement suggestion. I have created a ticket for us to investigate further. DLIS-5065

Leggerla commented 2 months ago

Hi! Has this feature not been implemented so far? My code would really benefit from it.