@Tabrizian Sorry for tagging directly; any help is appreciated. Thanks!
I have a use case where I want to share the same input across multiple models so that inference runs on each model in parallel and independently. This is fully supported by ensemble, but I would also like to be selective about which models receive the input: for a given request, I may want only 1 of the 5 models to run inference.
The general flow is like this:
```mermaid
graph TD
    A[Input: image] --> B[detector_preprocess]
    B --> |input| C[detector]
    B --> |preprocess_ratio| D[detector_postprocess]
    C --> |output| D
    D --> |boxes| E[recognizer_preprocess]
    A --> |image| E
    E --> |preprocessed_crops| F1[recognizer lang 1]
    E --> |preprocessed_crops| F2[recognizer lang 2]
    E --> |preprocessed_crops| F3[recognizer lang 3]
    E --> |preprocessed_crops| F4[recognizer lang 4]
    F1 --> |recognition_output| G[recognizer_postprocess]
    F2 --> |recognition_output| G
    F3 --> |recognition_output| G
    F4 --> |recognition_output| G
    G --> |decoded_text| H[Output: decoded_text]
    G --> |confidence_scores| I[Output: confidence_scores]
```
The flow above is possible via ensemble, but say for a given request I only wish to run `recognizer lang 1` and `recognizer lang 2`. Is this possible via ensemble? If not, how can I leverage BLS while still keeping this flow?
I think it is possible to do that by selecting the model you want to infer in the request; check this simple example of how to select the model you want to infer.
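In that spirit, here is a minimal sketch of per-request model selection in a Python-backend BLS model. The tensor names (`IMAGE`, `MODEL_NAME`, `OUTPUT`) are hypothetical; the idea is that the client sends the name of the model to run as a string tensor:

```python
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # Hypothetical inputs: the data tensor and the name of the
            # model this request should be routed to.
            image = pb_utils.get_input_tensor_by_name(request, "IMAGE")
            model_name = pb_utils.get_input_tensor_by_name(
                request, "MODEL_NAME").as_numpy()[0].decode("utf-8")

            # Issue a BLS call against only the selected model.
            infer_request = pb_utils.InferenceRequest(
                model_name=model_name,
                requested_output_names=["OUTPUT"],
                inputs=[image])
            infer_response = infer_request.exec()
            if infer_response.has_error():
                raise pb_utils.TritonModelException(
                    infer_response.error().message())

            output = pb_utils.get_output_tensor_by_name(
                infer_response, "OUTPUT")
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[output]))
        return responses
```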
So the idea is to treat the `recognizer_preprocess` as a BLS within the ensemble? Essentially the ensemble only goes up to the preprocessing layer, and the recognizers are separate models outside the ensemble?
I think you can do the whole process in a BLS. I don't know whether it is possible to call ensemble models from a BLS, as I have never tried, but I don't see why it wouldn't work.
You could have: `input` → `recognizer_preprocess` → recognizers → `recognizer_postprocess` → your outputs. Assuming your `recognizer_postprocess` deals with zeroed inputs, just fill your unused recognizer outputs with zeros (or with anything your postprocess will ignore) and it should work.
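A hedged sketch of that zero-filling approach on the postprocess side, assuming hypothetical input names `RECOGNITION_OUTPUT_1` through `RECOGNITION_OUTPUT_4` and that an all-zero tensor reliably marks an unselected recognizer:

```python
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    """recognizer_postprocess sketch that ignores recognizer outputs
    which were zero-filled because that language was not selected."""

    def execute(self, requests):
        responses = []
        for request in requests:
            kept = []
            for name in ("RECOGNITION_OUTPUT_1", "RECOGNITION_OUTPUT_2",
                         "RECOGNITION_OUTPUT_3", "RECOGNITION_OUTPUT_4"):
                arr = pb_utils.get_input_tensor_by_name(
                    request, name).as_numpy()
                # An all-zero output means this recognizer was skipped.
                if np.any(arr):
                    kept.append(arr)

            # Placeholder merge: a real postprocess would run its
            # CTC/attention decoding on the surviving outputs here.
            merged = (np.concatenate(kept, axis=0) if kept
                      else np.zeros((0,), dtype=np.float32))
            responses.append(pb_utils.InferenceResponse(output_tensors=[
                pb_utils.Tensor("DECODED_TEXT", merged)]))
        return responses
```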
Thanks! This at least gave me the motivation to split into two ensembles. Will keep you updated if the middle routing works.
As @gpadiolleau mentioned, you can call whatever model you want in BLS (including ensembles), so you should be able to create that pipeline in BLS.
Thanks @gpadiolleau @Tabrizian, this worked!
I created an ensemble up to `recognizer_preprocess`, and mini ensembles for each `recognizer` + `recognizer_postprocess`. A BLS router at the end of the first ensemble does the magic!
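For anyone landing here later, a hedged sketch of what such a BLS router could look like. All tensor names, the per-language mini-ensemble naming scheme, and the merging step are assumptions, not the actual code from this thread:

```python
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    """BLS router: called after the first ensemble, it forwards the
    preprocessed crops only to the requested per-language mini ensembles
    (recognizer + recognizer_postprocess) and merges their outputs."""

    def execute(self, requests):
        responses = []
        for request in requests:
            crops = pb_utils.get_input_tensor_by_name(
                request, "PREPROCESSED_CROPS")    # hypothetical name
            langs = pb_utils.get_input_tensor_by_name(
                request, "LANGUAGES").as_numpy()  # e.g. [b"lang1", b"lang2"]

            texts, scores = [], []
            for lang in langs:
                # Hypothetical naming scheme for the mini ensembles.
                ensemble_name = f"recognizer_{lang.decode()}_ensemble"
                infer_request = pb_utils.InferenceRequest(
                    model_name=ensemble_name,
                    requested_output_names=["DECODED_TEXT",
                                            "CONFIDENCE_SCORES"],
                    inputs=[crops])
                # exec() runs the calls one after another; async_exec()
                # could be used instead to run the selected recognizers
                # in parallel.
                infer_response = infer_request.exec()
                if infer_response.has_error():
                    raise pb_utils.TritonModelException(
                        infer_response.error().message())
                texts.append(pb_utils.get_output_tensor_by_name(
                    infer_response, "DECODED_TEXT").as_numpy())
                scores.append(pb_utils.get_output_tensor_by_name(
                    infer_response, "CONFIDENCE_SCORES").as_numpy())

            # Placeholder merge: concatenate per-language results.
            responses.append(pb_utils.InferenceResponse(output_tensors=[
                pb_utils.Tensor("DECODED_TEXT", np.concatenate(texts)),
                pb_utils.Tensor("CONFIDENCE_SCORES",
                                np.concatenate(scores)),
            ]))
        return responses
```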