We do implement the streaming API; however, it's not used in AI services. Getting it there would be great, but it would likely be pretty hard to do
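Just to illustrate the goal, here is a hypothetical sketch of what streaming in an AI service might eventually look like. The `StreamingAssistant` interface and the `Multi<String>` return type are assumptions, not something the extension supports today:

```java
// Hypothetical sketch only: an AI service method that streams tokens as a
// Mutiny Multi instead of returning the full String at once.
// StreamingAssistant is an invented name; this is not currently supported.
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;
import io.smallrye.mutiny.Multi;

@RegisterAiService
public interface StreamingAssistant {

    Multi<String> chat(@UserMessage String question);
}
```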
> We do implement the streaming API
Do you mean on the RESTEasy Reactive side? Sure. What I meant is https://platform.openai.com/docs/api-reference/streaming, but I definitely don't mean to suggest it can be done easily :-), I totally agree. OpenAI pushes the responses via SSE, see also https://community.openai.com/t/what-is-this-new-streaming-parameter/391558/9
So, just a wild theory: https://github.com/quarkiverse/quarkus-langchain4j/blob/main/samples/chatbot/src/main/java/io/quarkiverse/langchain4j/sample/chatbot/ChatBotWebSocket.java#L40 could be done by connecting an SSE response from the chat bot to a RESTEasy Reactive SSE output (as opposed to a WebSocket), possibly in some other demo. See the rough sketch below.
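Something like this, as a rough sketch assuming the LangChain4j 0.x streaming API and a made-up `StreamingChatResource`; exact interface names vary across LangChain4j versions:

```java
// Hypothetical sketch only: bridges LangChain4j token streaming to an SSE
// endpoint via RESTEasy Reactive. StreamingChatResource and the injected
// model are assumptions, not part of the linked sample.
package io.quarkiverse.langchain4j.sample.chatbot;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.model.StreamingResponseHandler;
import dev.langchain4j.model.chat.StreamingChatLanguageModel;
import dev.langchain4j.model.output.Response;
import io.smallrye.mutiny.Multi;
import jakarta.inject.Inject;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.QueryParam;
import jakarta.ws.rs.core.MediaType;

@Path("/chatbot")
public class StreamingChatResource {

    @Inject
    StreamingChatLanguageModel model; // e.g. the OpenAI streaming model

    @GET
    @Path("/stream")
    @Produces(MediaType.SERVER_SENT_EVENTS)
    public Multi<String> chat(@QueryParam("message") String message) {
        // RESTEasy Reactive sends each item of the Multi as one SSE event,
        // so every token the model pushes reaches the client immediately.
        return Multi.createFrom().emitter(emitter ->
                model.generate(message, new StreamingResponseHandler<AiMessage>() {

                    @Override
                    public void onNext(String token) {
                        emitter.emit(token);
                    }

                    @Override
                    public void onComplete(Response<AiMessage> response) {
                        emitter.complete();
                    }

                    @Override
                    public void onError(Throwable error) {
                        emitter.fail(error);
                    }
                }));
    }
}
```

The WebSocket demo linked above would look much the same, just writing each token to the session instead of emitting SSE events.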
Apparently (all or some of) the OpenAI APIs now have a boolean streaming option.
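For reference, the option is the boolean `stream` field on the request body; with it set, the chat completions endpoint answers with SSE `data:` chunks, roughly like this (abridged):

```
POST /v1/chat/completions
{ "model": "gpt-3.5-turbo", "stream": true, "messages": [ ... ] }

data: {"object":"chat.completion.chunk","choices":[{"delta":{"content":"Hel"}}]}

data: {"object":"chat.completion.chunk","choices":[{"delta":{"content":"lo"}}]}

data: [DONE]
```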
But in any case, I agree it may not be straightforward
I meant that we implement `StreamingChatModel` from LangChain4j
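For context, a minimal sketch of using that interface directly, assuming 0.x-era names like `StreamingChatLanguageModel` and `StreamingResponseHandler` (these have shifted between releases):

```java
// Sketch assuming the LangChain4j 0.x API: print tokens as the model pushes them.
import java.util.concurrent.CountDownLatch;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.model.StreamingResponseHandler;
import dev.langchain4j.model.chat.StreamingChatLanguageModel;
import dev.langchain4j.model.openai.OpenAiStreamingChatModel;
import dev.langchain4j.model.output.Response;

public class StreamingDemo {

    public static void main(String[] args) throws InterruptedException {
        StreamingChatLanguageModel model =
                OpenAiStreamingChatModel.withApiKey(System.getenv("OPENAI_API_KEY"));

        CountDownLatch done = new CountDownLatch(1);
        model.generate("Why is the sky blue?", new StreamingResponseHandler<AiMessage>() {

            @Override
            public void onNext(String token) {
                System.out.print(token); // tokens arrive one by one
            }

            @Override
            public void onComplete(Response<AiMessage> response) {
                done.countDown();
            }

            @Override
            public void onError(Throwable error) {
                error.printStackTrace();
                done.countDown();
            }
        });
        done.await(); // generate() is asynchronous, so wait for the stream to end
    }
}
```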
The hard part is making it work with AI services
I am actually going to close this in favor of https://github.com/quarkiverse/quarkus-langchain4j/issues/105, which is more targeted
@geoand Yeah, that is better, thanks
Hopefully it can be considered worth investigating. I've read around quite a few related discussions, and one of the main techniques for getting faster OpenAI response times is apparently to support a streaming API. For example, with a chat bot sample, users would see the response being formed gradually, word by word or sentence by sentence, minimising the effect of a somewhat slow response. Thanks