deepjavalibrary / djl-serving

A universal scalable machine learning model deployment solution

Ability to transform model outputs in DJL Serving #1214

Open rachitchauhan43 opened 10 months ago

rachitchauhan43 commented 10 months ago

Description

We are using SageMaker for large model inference (LMI) as documented here.

With this notebook https://github.com/deepjavalibrary/djl-demo/blob/master/aws/sagemaker/large-model-inference/sample-llm/rollingbatch_llama_7b_customized_preprocessing.ipynb, we saw that there is a way to manipulate the input before it goes to the model, because the parse_input method is available as a hook. However, we also need a way to manipulate the output before it leaves the model server.
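For illustration, something like the sketch below is what we have in mind. The hook names `custom_parse_input` and `custom_format_output` and their signatures are made up for this example; only the input-side customization exists today (via the linked notebook), and the output-side hook is the capability this issue asks for.

```python
# model.py -- illustrative sketch only; hook names and signatures are assumptions,
# not an existing djl-serving API. The input-side hook mirrors what the linked
# notebook already supports; the output-side hook is what this issue requests.
from djl_python import Input, Output


def custom_parse_input(inputs: Input) -> dict:
    # Input-side hook (already possible): reshape the request payload
    # before it reaches the model.
    payload = inputs.get_as_json()
    return {
        "inputs": payload["prompt"],
        "parameters": payload.get("parameters", {}),
    }


def custom_format_output(generated_text: str) -> Output:
    # Output-side hook (the feature being requested): transform the model
    # output before it leaves the model server.
    out = Output()
    out.add_as_json({"completion": generated_text.strip()})
    return out
```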

@lanking520 and @frankfliu Any thoughts on supporting that?

Will this change the current API? How? No, it's just an extension we are asking for.

The benefit of this will be that users won't have to write another service layer in front of the model server just to manipulate/transform outputs.

Who will benefit from this enhancement?

References

rachitchauhan43 commented 10 months ago

cc: @chirag-orbittec

frankfliu commented 10 months ago

@rachitchauhan43

We do have a plan to allow customizing the output. We have a few built-in output_formatters for rolling batch, and we are considering allowing users to point the output_formatter to a user-provided module.
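Roughly, the idea would be that the user ships a small Python module with a formatter function and points the server at it via configuration. The signature and the property name below are illustrative assumptions, not a finalized API:

```python
# my_formatters.py -- illustrative sketch only; the real output_formatter
# signature and configuration key are not settled by this issue.
import json


def custom_output_formatter(token_text: str, first_token: bool, last_token: bool) -> str:
    # Turn each generated chunk into a line of JSON before it is
    # streamed back to the client.
    record = {"text": token_text, "first": first_token, "last": last_token}
    return json.dumps(record) + "\n"
```

The server could then be told to load it with something like `option.output_formatter=my_formatters.custom_output_formatter` in serving.properties (property name assumed for illustration).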

Are you considering contributing to this feature?

chirag-orbittec commented 10 months ago

@frankfliu I can contribute that feature. We need this.