quarkiverse / quarkus-langchain4j

Quarkus Langchain4j extension
https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html
Apache License 2.0

Support Multi<ChatCompletionResponse> AI services #828

Open dastrobu opened 3 months ago

dastrobu commented 3 months ago

When declaring an AI service with the following signature:

public interface AiService {
    @SystemMessage("You are a professional poet")
    @UserMessage("""
            Write a poem about {topic}. The poem should be {lines} lines long.
        """)
    Multi<ChatCompletionResponse> writeAPoem(String topic, int lines);
}

I get the following exception at build time:

dev.langchain4j.exception.IllegalConfigurationException: Only Multi<String> is supported as a Multi return type. Offending method is 'fooAiService#writeAPoem'
    at dev.langchain4j.exception.IllegalConfigurationException.illegalConfiguration(IllegalConfigurationException.java:12)
    at io.quarkiverse.langchain4j.deployment.AiServicesProcessor.handleDeclarativeServices(AiServicesProcessor.java:442)

To access the metadata of the response, such as usage or finishReason, it would be necessary to access the underlying response objects.

geoand commented 3 months ago

We can probably add support for that.

geoand commented 3 months ago

> To be able to access the metadata of the response, such as usage or finishReason, it would be necessary to access the underlying response objects.

How do you plan to use these in a streaming fashion (as Multi implies)?

dastrobu commented 3 months ago

@geoand you would check the finish_reason of the last event before the [DONE] event.
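As a rough sketch of that idea (not the extension's API): the consumer appends each content delta as it arrives and remembers the most recent non-null finish reason. `Chunk` and `Result` are hypothetical stand-ins invented for this illustration, and the stream is simulated with a plain `Iterable` rather than a `Multi`.

```java
import java.util.List;

public class StreamingPoemConsumer {
    // Hypothetical stand-in for one streamed chunk; NOT a quarkus-langchain4j type.
    record Chunk(String content, String finishReason) {}

    // Accumulated text plus the finish reason reported by the final chunk.
    record Result(String text, String finishReason) {
        // "length" means the model stopped because it hit the token limit.
        boolean truncated() {
            return "length".equals(finishReason);
        }
    }

    // Drains the (simulated) stream: appends content deltas and keeps
    // the last non-null finish reason seen.
    static Result consume(Iterable<Chunk> chunks) {
        StringBuilder text = new StringBuilder();
        String finishReason = null;
        for (Chunk c : chunks) {
            if (c.content() != null) {
                text.append(c.content());
            }
            if (c.finishReason() != null) {
                finishReason = c.finishReason();
            }
        }
        return new Result(text.toString(), finishReason);
    }

    public static void main(String[] args) {
        Result r = consume(List.of(new Chunk("It", null), new Chunk(null, "length")));
        System.out.println(r.text() + " | truncated=" + r.truncated());
    }
}
```

With a `Multi<String>` return type only the `content` part of each chunk is visible, so the truncation check in `Result.truncated()` cannot be written.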

Here is an example of a raw API call (with max_tokens set to 1):

data: {"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"delta":{"content":"It"},"finish_reason":null,"index":0,"logprobs":null}],"created":1724759298,"id":"chatcmpl-A0oyIz0Qfd9Hq22NwN3VyT7FtgCzb","model":"gpt-4o-2024-05-13","object":"chat.completion.chunk","system_fingerprint":"fp_abc28019ad"}

data: {"choices":[{"content_filter_results":{},"delta":{},"finish_reason":"length","index":0,"logprobs":null}],"created":1724759298,"id":"chatcmpl-A0oyIz0Qfd9Hq22NwN3VyT7FtgCzb","model":"gpt-4o-2024-05-13","object":"chat.completion.chunk","system_fingerprint":"fp_abc28019ad"}

data: [DONE]

As you can see, the last event shows "finish_reason":"length", while previous events have "finish_reason":null.
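For illustration, here is a minimal sketch of reading that value from the raw event lines. The class and method names are made up for this example, and the regex is only a shortcut to keep the snippet dependency-free; real code would more likely use a JSON parser.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FinishReasonScanner {
    // Matches "finish_reason":null or "finish_reason":"<value>";
    // group 1 is only set for the quoted (non-null) form.
    private static final Pattern FINISH_REASON =
            Pattern.compile("\"finish_reason\"\\s*:\\s*(?:null|\"([^\"]*)\")");

    // Returns the last non-null finish_reason seen before the [DONE] event, if any.
    static Optional<String> lastFinishReason(List<String> sseLines) {
        String last = null;
        for (String line : sseLines) {
            if (!line.startsWith("data:")) {
                continue; // skip blank separator lines and anything that is not a data field
            }
            String payload = line.substring("data:".length()).trim();
            if (payload.equals("[DONE]")) {
                break; // end-of-stream marker
            }
            Matcher m = FINISH_REASON.matcher(payload);
            while (m.find()) {
                if (m.group(1) != null) {
                    last = m.group(1);
                }
            }
        }
        return Optional.ofNullable(last);
    }

    public static void main(String[] args) {
        // Shortened versions of the events shown above.
        List<String> events = List.of(
                "data: {\"choices\":[{\"delta\":{\"content\":\"It\"},\"finish_reason\":null,\"index\":0}]}",
                "",
                "data: {\"choices\":[{\"delta\":{},\"finish_reason\":\"length\",\"index\":0}]}",
                "",
                "data: [DONE]");
        System.out.println(lastFinishReason(events).orElse("none"));
    }
}
```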

geoand commented 3 months ago

Gotcha, thanks!

geoand commented 2 months ago

@jmartisk do you have any spare cycles to look into this?

My guess is that it shouldn't take more than a couple of hours for someone who knows the codebase :)