triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Support passing variables in config.pbtxt #7530

Open · riZZZhik opened 3 months ago

riZZZhik commented 3 months ago

Is your feature request related to a problem? Please describe.
Variables like max_batch_size, dynamic_batching, etc. need to be changed on every run based on the hardware environment and third-party constraints. Right now there isn't a convenient way to do so.

Describe the solution you'd like
Support environment variables inside the config.

Example config.pbtxt:

...
max_batch_size: ${MAX_BATCH_SIZE} or $$MAX_BATCH_SIZE
...

and/or pass them to the tritonserver command:

tritonserver --model-repository models --custom_variables '{"var1": "value1"}'
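
As a possible stopgap until something like this is supported natively, the ${VAR} syntax above can be emulated today with envsubst from GNU gettext: keep templated configs alongside the models and expand them in an entrypoint before launching the server. A minimal sketch, assuming each model ships a hypothetical config.pbtxt.tpl template next to where config.pbtxt should live:

#!/bin/sh
# Expand ${VAR} placeholders (e.g. ${MAX_BATCH_SIZE}) in every template,
# writing the result next to it as config.pbtxt, then start the server.
find "$MODELS_PATH" -type f -name 'config.pbtxt.tpl' | while read -r tpl; do
    envsubst < "$tpl" > "${tpl%.tpl}"
done
exec tritonserver --model-repository "$MODELS_PATH"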

Describe alternatives you've considered
I wrote a shell script that runs before tritonserver and uses sed to replace values based on environment variables. It works, but it isn't convenient and poses issues inside Kubernetes.

# Patch max_batch_size in every config.pbtxt under $MODELS_PATH using
# the MAX_BATCH_SIZE environment variable before starting tritonserver.
find "$MODELS_PATH" -type f -name 'config.pbtxt' | while read -r config; do
    sed -i -e "s/max_batch_size: [0-9]\{1,\}/max_batch_size: ${MAX_BATCH_SIZE}/g" "$config"
done
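
Part of the Kubernetes friction is that sed -i needs write access, and ConfigMap/Secret volume mounts are read-only, so in-place edits fail there. A sketch of a workaround, copying the repository into a writable location first (the /models-ro and /tmp/models paths are illustrative):

# Copy the read-only model repository somewhere writable, patch the
# copies, then point tritonserver at the patched repository.
cp -r /models-ro /tmp/models
find /tmp/models -type f -name 'config.pbtxt' \
    -exec sed -i "s/max_batch_size: [0-9]\{1,\}/max_batch_size: ${MAX_BATCH_SIZE}/g" {} +
exec tritonserver --model-repository /tmp/models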