triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Add model warmup functionality for ensemble models #6877

Open · Timen opened this issue 8 months ago

Timen commented 8 months ago

Is your feature request related to a problem? Please describe.
Support model warmup for ensemble models.

Describe the solution you'd like
The option to specify warmup inputs for ensemble models.

Describe alternatives you've considered
Manually warming up the pipeline by sending inference requests to the ensemble after the server starts (see the sketch below).
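For context, the manual alternative is a short client script that drives dummy requests through the ensemble once the server is up. A minimal sketch using the tritonclient Python package; the ensemble name my_ensemble, the input name INPUT0, and the shape are placeholders:

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to a locally running Triton server (default HTTP port).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Build a dummy input matching the ensemble's declared input.
# "my_ensemble", "INPUT0", and the shape are assumptions for this sketch.
inp = httpclient.InferInput("INPUT0", [1, 3, 224, 224], "FP32")
inp.set_data_from_numpy(np.zeros((1, 3, 224, 224), dtype=np.float32))

# A few requests so every composing model in the pipeline gets exercised.
for _ in range(5):
    client.infer(model_name="my_ensemble", inputs=[inp])
```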

kthui commented 8 months ago

Hi @Timen, does setting Model Warmup on all composing models not help? The ensemble model only passes inputs and outputs between composing models and does not do any real work itself.
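For reference, warmup on a composing model is configured via the model_warmup field in that model's config.pbtxt. A minimal sketch, assuming a single FP32 input named INPUT0 (the name, dims, and data type are placeholders):

```
# config.pbtxt of a composing model (not the ensemble)
model_warmup [
  {
    name: "zero_data_warmup"
    batch_size: 1
    inputs: {
      key: "INPUT0"
      value: {
        data_type: TYPE_FP32
        dims: [ 3, 224, 224 ]
        zero_data: true  # all-zero tensors; random_data is also available
      }
    }
  }
]
```

Each composing model in the pipeline needs its own such stanza, which is the per-model effort discussed below.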

Timen commented 8 months ago

Setting up model warmup for each individual part of a pipeline is much more effort than specifying a single warmup input for the ensemble. It is also surprising, and undocumented, that ensembles don't support it.

kthui commented 8 months ago

@GuanLuo @nnshah1 do you want to file a ticket for adding warmup config into ensemble model config?

Timen commented 7 months ago

@kthui @GuanLuo @nnshah1 any updates on this? It is still a very welcome feature.