langchain4j / langchain4j

Java version of LangChain
https://docs.langchain4j.dev
Apache License 2.0
4.88k stars 970 forks

AI Services streaming support #29

Closed breezexiao closed 1 year ago

breezexiao commented 1 year ago

Can AIServices support streaming output?

langchain4j commented 1 year ago

@breezexiao What is your use case? Please share some code showing how you would expect to use this new API. Thanks.

breezexiao commented 1 year ago

@langchain4j When developing applications with AIServices, we may need to add tool chains and EmbeddingStoreRetriever support, and the whole process can take a long time. So I hope AIServices can also support streaming output, which we would then push to the frontend through SSE or similar. Thanks.

langchain4j commented 1 year ago

@breezexiao I am afraid streaming will not work with tools, as OpenAI does not support that. But we can add streaming support for the final response from the LLM (after the tool has been used and its execution result has been sent back to the LLM). I would like to understand a bit better which libraries/frameworks you are using and how the stream will be sent to the frontend. Thanks.

breezexiao commented 1 year ago

@langchain4j Got it, streaming support for the final response is a good idea. We send streamed data through ResponseBodyEmitter, for example:

ResponseBodyEmitter emitter = new ResponseBodyEmitter();
emitter.send(message);
emitter.complete();

Thanks

langchain4j commented 1 year ago

@breezexiao ok, thanks. Will check today

breezexiao commented 1 year ago

@langchain4j Hi, I found that streaming also supports function calls. Here is an example of the streamed output; I hope it helps:

{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"role":"assistant","content":null,"function_call":{"name":"getWeather","arguments":""}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"{\n"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":" "}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":" \""}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"arg"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"0"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"\":"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":" "}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"31"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"."}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"230"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"4"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":",\n"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":" "}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":" \""}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"arg"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"1"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"\":"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":" "}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"121"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"."}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"473"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"7"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":",\n"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":" "}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":" \""}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"arg"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"2"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"\":"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":" "}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"1"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"\n"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"}"}},"finish_reason":null}]}
{"id":"chatcmpl-7cxMqb5Q4SmmQBdRQOWnk48aI3c1l","object":"chat.completion.chunk","created":1689519028,"model":"gpt-3.5-turbo-16k-0613","choices":[{"index":0,"delta":{},"finish_reason":"function_call"}]}
[DONE]
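
For readers following along: the chunks above carry the function name once and the arguments as small string fragments, so a client has to concatenate them before the call can be parsed. A minimal sketch of that accumulation, assuming a hypothetical helper class (FunctionCallAccumulator and its onDelta method are illustrative names, not langchain4j or OpenAI client API):

```java
import java.util.List;

// Hypothetical helper (not library API): joins streamed function_call deltas.
final class FunctionCallAccumulator {
    private String name;
    private final StringBuilder arguments = new StringBuilder();

    // Called once per chunk; a field is null when that chunk does not carry it.
    void onDelta(String nameFragment, String argumentsFragment) {
        if (nameFragment != null) {
            name = nameFragment;
        }
        if (argumentsFragment != null) {
            arguments.append(argumentsFragment);
        }
    }

    String name() { return name; }

    String arguments() { return arguments.toString(); }
}

class FunctionCallAccumulatorDemo {
    // Replays the "arguments" fragments from the chunks posted above, in order.
    static FunctionCallAccumulator replay() {
        FunctionCallAccumulator acc = new FunctionCallAccumulator();
        acc.onDelta("getWeather", ""); // first chunk: name plus empty arguments
        List<String> fragments = List.of(
                "{\n", " ", " \"", "arg", "0", "\":", " ", "31", ".", "230", "4",
                ",\n", " ", " \"", "arg", "1", "\":", " ", "121", ".", "473", "7",
                ",\n", " ", " \"", "arg", "2", "\":", " ", "1", "\n", "}");
        for (String fragment : fragments) {
            acc.onDelta(null, fragment);
        }
        return acc;
    }

    public static void main(String[] args) {
        FunctionCallAccumulator acc = replay();
        System.out.println(acc.name());
        System.out.println(acc.arguments());
    }
}
```

The arguments only form valid JSON once the last fragment has arrived, which is why the call cannot be executed until the chunk with finish_reason "function_call" is seen.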

langchain4j commented 1 year ago

@breezexiao Wow, that's cool! Too bad we still can't call the function before we have received all of its arguments... Regarding streaming support in general, I think we can introduce a class that you declare as the return type of a service method and then subscribe to, something like this (not a final version, just an idea):

interface ServiceWithStreaming {

    Streamer chat(String message);
}

Streamer streamer = service.chat("hello");

streamer.onNewToken(token -> emitter.send(token))
    .onError(e -> emitter.completeWithError(e))
    .onComplete(() -> emitter.complete());

A much simpler option would be to support returning ResponseBodyEmitter/SseEmitter directly from the service, like this:

interface ServiceWithStreaming {

    ResponseBodyEmitter chat(String message);
}

This way you won't have to re-map from Streamer to Emitter, but I have to check whether this is possible without introducing lots of Spring dependencies into the library... If you have ideas, please share.
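
To make the Streamer idea above concrete, here is a minimal sketch of a callback-holder class with no Spring dependency. Everything in it is an assumption for illustration: the class body, and the producer-side methods emit, fail and complete, are hypothetical and do not describe any released langchain4j API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of the "Streamer" idea; not part of langchain4j.
final class Streamer {
    private Consumer<String> onNewToken = token -> { };
    private Consumer<Throwable> onError = error -> { };
    private Runnable onComplete = () -> { };

    // Consumer side: fluent registration, mirroring the usage shown above.
    Streamer onNewToken(Consumer<String> handler) { this.onNewToken = handler; return this; }
    Streamer onError(Consumer<Throwable> handler) { this.onError = handler; return this; }
    Streamer onComplete(Runnable handler) { this.onComplete = handler; return this; }

    // Producer side (assumed names): the model integration would call these.
    void emit(String token) { onNewToken.accept(token); }
    void fail(Throwable error) { onError.accept(error); }
    void complete() { onComplete.run(); }
}

class StreamerDemo {
    // Registers callbacks first, then simulates a model producing two tokens.
    static List<String> run() {
        List<String> received = new ArrayList<>();
        Streamer streamer = new Streamer()
                .onNewToken(received::add)
                .onError(e -> received.add("error: " + e.getMessage()))
                .onComplete(() -> received.add("<done>"));
        streamer.emit("Hel");
        streamer.emit("lo");
        streamer.complete();
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

One design point this sketch exposes: if the model started producing tokens as soon as chat(...) returned, tokens emitted before the callbacks were registered would be lost. An explicit start step after registration avoids that race.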

breezexiao commented 1 year ago

@langchain4j I have an idea: can we extend StreamingResultHandler? For example, add a subclass that proxies a StreamingResultHandler, e.g. StreamingFunctionCallResultHandler, and give it a method for handling function calls, such as onFunctionCall(AIMessage). When client.chatCompletion(request).onPartialResponse() is executed, we check the function_call field in the returned delta to decide whether this is a function call. If so, we temporarily store the function name and arguments of the function_call. When onComplete is called and it is a function call, we execute onFunctionCall(AIMessage), which runs the tool chain logic, similar to AiServices; otherwise we execute the proxied StreamingResultHandler logic. For ServiceWithStreaming, maybe we can define the method as void chat(String message, StreamingResultHandler handler), so that the stream is processed in the handler. Thanks
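
The proposal above can be sketched roughly as follows. This is only an illustration under stated assumptions: the StreamingResultHandler interface here is a stand-in whose methods may differ from the library's, onPartialFunctionCall is a hypothetical hook for function_call deltas, and onFunctionCall takes the name and arguments as plain strings instead of the AIMessage mentioned in the comment.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the library's handler; the real interface may differ.
interface StreamingResultHandler {
    void onPartialResult(String partialResult);
    void onComplete();
    void onError(Throwable error);
}

// Hypothetical proxy: buffers function_call deltas, forwards everything else.
abstract class StreamingFunctionCallResultHandler implements StreamingResultHandler {
    private final StreamingResultHandler delegate;
    private String functionName;
    private final StringBuilder functionArguments = new StringBuilder();

    StreamingFunctionCallResultHandler(StreamingResultHandler delegate) {
        this.delegate = delegate;
    }

    // Assumed hook: invoked when a delta contains a function_call field.
    void onPartialFunctionCall(String nameFragment, String argumentsFragment) {
        if (nameFragment != null) functionName = nameFragment;
        if (argumentsFragment != null) functionArguments.append(argumentsFragment);
    }

    @Override public void onPartialResult(String partialResult) {
        delegate.onPartialResult(partialResult);
    }

    @Override public void onComplete() {
        if (functionName != null) {
            onFunctionCall(functionName, functionArguments.toString());
        } else {
            delegate.onComplete();
        }
    }

    @Override public void onError(Throwable error) {
        delegate.onError(error);
    }

    // Tool execution would go here once the full call has been collected.
    abstract void onFunctionCall(String name, String arguments);
}

class FunctionCallHandlerDemo {
    static List<String> run(boolean withFunctionCall) {
        List<String> events = new ArrayList<>();
        StreamingResultHandler delegate = new StreamingResultHandler() {
            @Override public void onPartialResult(String p) { events.add("token:" + p); }
            @Override public void onComplete() { events.add("complete"); }
            @Override public void onError(Throwable e) { events.add("error"); }
        };
        StreamingFunctionCallResultHandler handler =
                new StreamingFunctionCallResultHandler(delegate) {
                    @Override void onFunctionCall(String name, String arguments) {
                        events.add("call:" + name + arguments);
                    }
                };
        if (withFunctionCall) {
            handler.onPartialFunctionCall("getWeather", "");
            handler.onPartialFunctionCall(null, "{\"arg2\":");
            handler.onPartialFunctionCall(null, " 1}");
        } else {
            handler.onPartialResult("Hi");
        }
        handler.onComplete();
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(true));
        System.out.println(run(false));
    }
}
```

In this sketch a function-call response never reaches the delegate's onComplete; a fuller version would send the tool result back to the model and stream the follow-up answer to the same delegate.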

langchain4j commented 1 year ago

@breezexiao I am not sure I understand your goal. Do you want to stream the function call response (if yes, please explain the reason)? Or do you want to get a callback when the function call is triggered by the LLM and be able to intervene? Thank you!

breezexiao commented 1 year ago

@langchain4j I just want to use AIServices and send the data to the frontend as a stream, without blocking while waiting for the full response. Thanks.

langchain4j commented 1 year ago

@breezexiao OK, got it. I will add this option ASAP (hopefully this week).

langchain4j commented 1 year ago

@breezexiao I have added streaming support to AI Services in 0.18.0:

interface Assistant {

    TokenStream chat(String message);
}

Assistant assistant = AiServices.create(Assistant.class, model);

assistant.chat("Tell me a joke")
    .onNext(token -> ...)
    .onComplete(() -> ...)
    .onError(e -> ...)
    .start();