Azure / azure-sdk-for-java

This repository is for active development of the Azure SDK for Java. For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/java/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-java.
MIT License

Structure completion request to maximize Prompt Caching #42805

Open brandonh-msft opened 1 week ago

brandonh-msft commented 1 week ago

Today, a request to an OpenAI service is encoded by simply JSON-serializing the options model to BinaryData and sending it through the pipeline.

This does not maximize Prompt Caching, which works best when the completion request contains tools, then history, then new content, in that order. Additionally, the tools and history must appear in the same order every time (I suggest alphabetical order by tool name).
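To illustrate the ordering described above, here is a minimal sketch that is not taken from the SDK. The `Tool` and `Message` records, the `toJson` helper, and the simplified JSON shape are all hypothetical stand-ins for illustration only; the real request schema and the SDK's `ChatCompletionsOptions` serialization differ. It only shows the idea: emit tools first (sorted by name), then history in its original order, then the new message, so that consecutive requests share the longest possible identical prefix.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class CacheFriendlyRequest {
    // Hypothetical stand-ins; these are NOT types from azure-ai-openai.
    record Tool(String name, String description) {}
    record Message(String role, String content) {}

    // Minimal JSON string quoting (backslash and double quote only); enough for this sketch.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Serializes tools first (alphabetical by name), then history, then the new message.
    static String toJson(List<Tool> tools, List<Message> history, Message newMessage) {
        String toolsJson = tools.stream()
            .sorted(Comparator.comparing(Tool::name))
            .map(t -> "{\"name\":" + quote(t.name())
                + ",\"description\":" + quote(t.description()) + "}")
            .collect(Collectors.joining(","));

        // History keeps its chronological order; the new message is always appended last.
        List<Message> all = new ArrayList<>(history);
        all.add(newMessage);
        String messagesJson = all.stream()
            .map(m -> "{\"role\":" + quote(m.role())
                + ",\"content\":" + quote(m.content()) + "}")
            .collect(Collectors.joining(","));

        return "{\"tools\":[" + toolsJson + "],\"messages\":[" + messagesJson + "]}";
    }

    // Length of the shared leading run of characters between two serialized requests.
    static int commonPrefixLength(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        List<Tool> tools = List.of(
            new Tool("weather", "Look up a forecast"),
            new Tool("calculator", "Evaluate arithmetic"));
        List<Message> history = List.of(new Message("user", "Hello"));

        String first = toJson(tools, history, new Message("user", "What is 2 + 2?"));

        // Same tools supplied in a different order, with one more turn of history.
        List<Tool> shuffled = List.of(tools.get(1), tools.get(0));
        List<Message> longer = List.of(
            new Message("user", "Hello"),
            new Message("user", "What is 2 + 2?"),
            new Message("assistant", "4"));
        String second = toJson(shuffled, longer, new Message("user", "And 3 + 3?"));

        System.out.println(first);
        System.out.println(second);
        System.out.println("Shared prefix length: " + commonPrefixLength(first, second));
    }
}
```

Because the tools are sorted before serialization and the new content always comes last, the second request repeats everything the first one sent up to the point where the conversation diverges, which is the property a prefix-based cache can take advantage of.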

Sources: https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/prompt-caching https://openai.com/index/api-prompt-caching/ https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/prompt-caching#what-is-cached

The client asks for BinaryData from the options: https://github.com/Azure/azure-sdk-for-java/blob/cc459eee27b3b7b59452faf41e69101f017cd816/sdk/openai/azure-ai-openai/src/main/java/com/azure/ai/openai/OpenAIClient.java#L726

This simply uses a default serialization implementation to turn the ChatCompletionsOptions into BinaryData: https://github.com/Azure/azure-sdk-for-java/blob/cc459eee27b3b7b59452faf41e69101f017cd816/sdk/core/azure-core/src/main/java/com/azure/core/util/BinaryData.java#L614-L615 https://github.com/Azure/azure-sdk-for-java/blob/cc459eee27b3b7b59452faf41e69101f017cd816/sdk/core/azure-core/src/main/java/com/azure/core/util/BinaryData.java#L181

Additional context

https://github.com/microsoft/semantic-kernel/discussions/9444 https://github.com/openai/openai-dotnet/issues/281

mssfang commented 1 week ago

Hi @brandonh-msft, the Java SDK is currently working on service API version 2024-10-01-preview. We will keep you posted when it is released.

Are you suggesting that ChatCompletionsOptions should always serialize tools ahead of messages and other properties?

brandonh-msft commented 1 week ago

well, I'm not, the feature does 😉

Tools first, then history, then new content should be the structure in order to maximize prompt caching, per the docs for the feature from AOAI and OAI.