deepjavalibrary / djl-serving

A universal scalable machine learning model deployment solution
Apache License 2.0
183 stars 58 forks

[3p][python]add metering and error details to 3p outputs #2143

Closed siddvenk closed 1 week ago

siddvenk commented 1 week ago

Description

3p Changes:

  1. Fix the output formatters so they respond with the expected payload schema.
  2. Add metering and error details to the output schema for the 3p use case.
  3. Force do_sample to true when temperature > 0.
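
For illustration, item 3 could look roughly like the sketch below. The function name `normalize_sampling_params` and the plain-dict parameter shape are assumptions made for this example; they are not taken from the djl-serving code.

```python
def normalize_sampling_params(params: dict) -> dict:
    """Return a copy of params with do_sample forced on when temperature > 0.

    Hypothetical helper: the name and dict-based shape are illustrative only.
    """
    normalized = dict(params)  # copy so the caller's dict is not mutated
    temperature = normalized.get("temperature")
    if temperature is not None and temperature > 0:
        # A positive temperature only has an effect when sampling is enabled,
        # so override whatever the caller passed for do_sample.
        normalized["do_sample"] = True
    return normalized
```

With this sketch, a request that sets a positive temperature but leaves do_sample false is sampled anyway, while a request with no temperature or a temperature of 0 is left unchanged.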

General Change:

  1. Add exception details to the Token that is set during rolling batch inference. This lets output formatters report specific error details in the response instead of a generic "error". Currently only the 3p output formatters use this.
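
As a rough sketch of the idea, an output formatter could read the error details off the token and include them, along with metering counts, in the response. Everything here is hypothetical: the `Token` stand-in, its `error_msg` field, the `format_output` function, and the response keys are assumptions for illustration, since the actual 3p payload schema is not shown in this description.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """Hypothetical stand-in for the token set during rolling batch inference."""
    id: int
    text: str
    error_msg: Optional[str] = None  # assumed field carrying exception details


def format_output(tokens: List[Token], input_token_count: int) -> str:
    """Build a JSON response with metering and, on failure, error details."""
    # The first token carrying an error message, if any, marks the failure.
    failed = next((t for t in tokens if t.error_msg is not None), None)
    generated = [t for t in tokens if t.error_msg is None]
    body = {
        "generated_text": "".join(t.text for t in generated),
        "metering": {
            "input_tokens": input_token_count,
            "output_tokens": len(generated),
        },
    }
    if failed is not None:
        # Surface the specific exception text instead of a generic "error".
        body["error"] = {"message": failed.error_msg}
    return json.dumps(body)
```

The "error" key is only present when a token carries error details, so successful responses keep the same shape as before.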