kserve / open-inference-protocol

Repository for open inference protocol specification
Apache License 2.0

Text Generate REST API schema #18

Closed: gavrissh closed this 9 months ago

gavrissh commented 11 months ago

This PR proposes REST API endpoints for text generation:

/v2/models/{model_name}/versions/${MODEL_VERSION}/generate
/v2/models/{model_name}/versions/${MODEL_VERSION}/generate_stream

[Screenshots attached: 2024-01-16 at 6:43:56 PM, 6:44:46 PM, 6:46:46 PM]

Reference - https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/protocol/extension_generate.html#generate-extension
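For readers following along, here is a minimal sketch of how a client might assemble the two proposed paths. The helper name `generate_paths` and the sample model name and version are hypothetical, and percent-encoding the path segments is an assumption rather than something the proposal states.

```python
from urllib.parse import quote


def generate_paths(model_name: str, model_version: str) -> dict[str, str]:
    """Build the two proposed endpoint paths for a model name and version.

    Hypothetical helper: the proposal only lists the path templates.
    Percent-encoding each segment is an assumption made here so that
    names containing '/' or spaces cannot alter the path structure.
    """
    base = (
        f"/v2/models/{quote(model_name, safe='')}"
        f"/versions/{quote(model_version, safe='')}"
    )
    return {
        "generate": f"{base}/generate",
        "generate_stream": f"{base}/generate_stream",
    }


# Illustrative values only; "my-llm" and "1" are not from the proposal.
paths = generate_paths("my-llm", "1")
print(paths["generate"])
print(paths["generate_stream"])
```

The paths are relative; a client would prepend whatever host and port its inference server listens on.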

gavrissh commented 11 months ago

@yuzisun Wanted to follow up: is the current state of the changes alright?

gavrissh commented 10 months ago

I have updated the PR with all the recently discussed changes.

cmaddalozzo commented 10 months ago

We should probably add the option to return log probabilities in the result. This seems to be fairly common among other APIs. This would comprise a boolean logprobs parameter in the request and a corresponding logprobs property in the response containing an array of objects with keys token and logprob.
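To make the suggested shape concrete, here is a rough sketch of what a client could do with such a response. Only the `logprobs` flag and the `token` / `logprob` keys come from the suggestion above; the sample values, the helper `sequence_logprob`, and where the flag sits inside the request body are assumptions for illustration.

```python
import math


def sequence_logprob(response: dict) -> float:
    """Sum per-token log probabilities from a response.

    Assumes the response carries a `logprobs` array of objects with
    `token` and `logprob` keys, as suggested in the comment above.
    A missing or empty array yields 0.0.
    """
    entries = response.get("logprobs") or []
    return sum(entry["logprob"] for entry in entries)


# Request fragment: the boolean flag from the suggestion. Its exact
# position in the request body is not specified here (assumption).
request_fragment = {"logprobs": True}

# Hypothetical response fragment with made-up values.
sample_response = {
    "logprobs": [
        {"token": "Hello", "logprob": -0.5},
        {"token": " world", "logprob": -1.5},
    ]
}

total = sequence_logprob(sample_response)
probability = math.exp(total)  # joint probability of the generated tokens
print(total, probability)
```

Summing log probabilities rather than multiplying raw probabilities avoids underflow on long generations, which is one reason APIs tend to expose the log form.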

gavrissh commented 10 months ago

> We should probably add the option to return log probabilities in the result. This seems to be fairly common among other APIs. This would comprise a boolean logprobs parameter in the request and a corresponding logprobs property in the response containing an array of objects with keys token and logprob.

I have updated the PR to support the items above.

yuzisun commented 10 months ago

Thanks @gavrishp !! Great job on getting this going with the initial version.

/lgtm /approve

oss-prow-bot[bot] commented 10 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gavrishp, yuzisun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/kserve/open-inference-protocol/blob/main/OWNERS)~~ [yuzisun]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment