quarkiverse / quarkus-langchain4j

Quarkus Langchain4j extension
https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html
Apache License 2.0

Forcing `response_format` to json #983

Open · andreadimaio opened 3 days ago

andreadimaio commented 3 days ago

The ChatLanguageModel interface provides a new method that can be implemented to force response_format to json when an AiService method returns a POJO. This is something that Quarkus could do automatically.

// New (experimental) default method on ChatLanguageModel
@Experimental
default ChatResponse chat(ChatRequest request) {
    throw new UnsupportedOperationException();
}

This should be a simple change to the AiServiceMethodImplementationSupport class, but all current providers will need to be updated to implement this new method.
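For illustration, a minimal sketch of the kind of AiService this affects (the PersonExtractor interface and Person record here are hypothetical):

// Hypothetical AiService: because extract() returns a POJO, Quarkus could
// automatically force response_format=json on the underlying chat request.
@RegisterAiService
public interface PersonExtractor {

    @UserMessage("Extract the person described in: {text}")
    Person extract(String text);
}

// Hypothetical POJO return type.
record Person(String name, int age) {}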

Does this make sense?

geoand commented 3 days ago

I think it does

geoand commented 11 hours ago

cc @langchain4j

geoand commented 11 hours ago

@andreadimaio do you want to work on this?

andreadimaio commented 10 hours ago

> @andreadimaio do you want to work on this?

Yes, I'll open a new PR

geoand commented 10 hours ago

🙏🏽

andreadimaio commented 8 hours ago

The implementation is a little more complex than what I had in mind, for the simple reason that OpenAI also has another response_format option called json_schema (watsonx.ai only supports json_object).

If the json_schema option is enabled, the API also takes the schema of the object as input. In this case it would be useful to create this schema at build time.
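To make the difference concrete, here is a rough sketch of the two variants using the experimental LangChain4j request API (the JsonSchema/JsonObjectSchema builder calls assume a recent LangChain4j version, and the Person schema is hypothetical):

// json_object: only guarantees syntactically valid JSON, no schema attached
// (this is all watsonx.ai supports).
ResponseFormat jsonObject = ResponseFormat.builder()
        .type(ResponseFormatType.JSON)
        .build();

// json_schema: additionally carries the schema of the returned object,
// which Quarkus could generate at build time from the POJO.
ResponseFormat withSchema = ResponseFormat.builder()
        .type(ResponseFormatType.JSON)
        .jsonSchema(JsonSchema.builder()
                .name("Person") // hypothetical type
                .rootElement(JsonObjectSchema.builder()
                        .addStringProperty("name")
                        .addIntegerProperty("age")
                        .build())
                .build())
        .build();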

andreadimaio commented 7 hours ago

I'm not an expert on the OpenAI APIs, but I think that if response_format is equal to json_schema, it makes no sense to inject the message "You must answer strictly in the following JSON format: ..." into the prompt, because OpenAI itself will make sure the output conforms to the schema. This message can still be injected for the other types (TEXT and JSON_OBJECT), but this is a detail that can be overlooked for now.

langchain4j commented 7 hours ago

@andreadimaio it works like this in vanilla LC4j: if a schema can be passed, we do not append extra instructions.
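Roughly, the rule looks like this (illustrative pseudocode, not the actual LC4j source; jsonStructure stands for the rendered format hint):

if (supportsJsonSchema && jsonSchema.isPresent()) {
    // response_format=json_schema: the provider enforces the structure,
    // so no extra instructions are appended to the prompt.
} else {
    // TEXT or json_object: fall back to prompt-level instructions.
    messages.add(UserMessage.from(
            "You must answer strictly in the following JSON format: " + jsonStructure));
}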

langchain4j commented 7 hours ago

But the schema is currently supported only by OpenAI and Gemini.

andreadimaio commented 3 hours ago

There's something that's not clear to me. Looking at the DefaultAiServices.java class, there are these lines:

Response<AiMessage> response;
if (supportsJsonSchema && jsonSchema.isPresent()) {
    ChatRequest chatRequest = ChatRequest.builder()
        .messages(messages)
        .toolSpecifications(toolSpecifications)
        .responseFormat(ResponseFormat.builder()
            .type(JSON)
            .jsonSchema(jsonSchema.get())
            .build())
        .build();

    ChatResponse chatResponse = context.chatModel.chat(chatRequest);

    response = new Response<>(
        chatResponse.aiMessage(),
        chatResponse.tokenUsage(),
        chatResponse.finishReason()
    );
} else {
    // TODO migrate to new API
    response = toolSpecifications == null
        ? context.chatModel.generate(messages)
        : context.chatModel.generate(messages, toolSpecifications);
}

The chat method is invoked only when the provider supports json_schema, but what about json_object? I would like to force the use of the chat method in this case too. Maybe the Capability class should also contain this type.

Another note is about the default implementation of the chat method. It has all the parameters needed to call the generate methods when the provider doesn't support response_format. Wouldn't it be better to have this kind of default implementation instead of throwing an exception? That way it should be easier to handle the chat method for all model providers (maybe I'm missing something?!).
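Something along these lines (a sketch only; it assumes ChatRequest exposes messages() and toolSpecifications() accessors and that ChatResponse has a builder):

@Experimental
default ChatResponse chat(ChatRequest request) {
    // Fall back to the legacy generate() methods instead of throwing;
    // request.responseFormat() is simply ignored by providers that
    // don't support it.
    Response<AiMessage> response = request.toolSpecifications() == null
            ? generate(request.messages())
            : generate(request.messages(), request.toolSpecifications());

    return ChatResponse.builder()
            .aiMessage(response.content())
            .tokenUsage(response.tokenUsage())
            .finishReason(response.finishReason())
            .build();
}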

@langchain4j

andreadimaio commented 3 hours ago

Or is your idea to use RESPONSE_FORMAT_JSON_SCHEMA for both values (json_object and json_schema)? Maybe so, because in the end the logic inside the chat method can use the value passed to make the correct call to the endpoint.

langchain4j commented 3 hours ago

I was planning to add another Capability for json_object. It should be easy, as we know which providers support JSON mode.
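Something like this (hypothetical sketch; RESPONSE_FORMAT_JSON_OBJECT does not exist yet):

public enum Capability {
    RESPONSE_FORMAT_JSON_SCHEMA, // already exists: provider accepts a full JSON schema
    RESPONSE_FORMAT_JSON_OBJECT  // proposed: provider supports plain JSON mode only
}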

Regarding the default implementation of the chat method, you're right, it should call the generate methods. I actually implemented it this way initially, but then rolled back because I had some doubts about it. This is work in progress; I plan to get back to this new API soon. Eventually the generate methods will be deprecated and providers will need to implement only one method: chat.

langchain4j commented 3 hours ago

The chat method is used only when the JSON schema capability is present because I had to rush this new chat API in order to enable structured outputs; otherwise there was no way to pass the schema. WIP...

andreadimaio commented 3 hours ago

> I was planning to add another Capability for json_object. It should be easy, as we know which providers support JSON mode.
>
> Regarding the default implementation of the chat method, you're right, it should call the generate methods. I actually implemented it this way initially, but then rolled back because I had some doubts about it. This is work in progress; I plan to get back to this new API soon. Eventually the generate methods will be deprecated and providers will need to implement only one method: chat.
>
> The chat method is used only when the JSON schema capability is present because I had to rush this new chat API in order to enable structured outputs; otherwise there was no way to pass the schema. WIP...

Thank you!

@geoand what do you suggest doing regarding the implementation of this functionality in quarkus-langchain4j? I could go ahead and implement what is there today, or wait for a new release.