Open · ryancriswell opened this issue 2 months ago
Hi @ryancriswell, would you like to implement this?
Yeah, I can start working on it today.
@ryancriswell great! Do you have a plan on how to implement it already?
Somewhat. My initial thought is to do this in AnthropicChatModel.generate: append the prefill as a trailing assistant message on the AnthropicCreateMessageRequest, then, when mapping the AnthropicCreateMessageResponse, alter it to include the "{", since the prefill is a separate message and is not echoed back by the API. The AnthropicStreamingChatModel version would look a little different.
Any thoughts or feedback?
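To make the idea concrete, here is a rough sketch of the two changes. AnthropicMessage and the method names below are placeholders, not langchain4j's actual internal types:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for langchain4j's internal Anthropic DTOs; the real
// classes and field names may differ (illustration only, not the library API).
record AnthropicMessage(String role, String content) {}

class PrefillSketch {

    static final String PREFILL = "{";

    // Request side: append a trailing assistant message so Claude continues
    // from "{" instead of emitting a natural-language preamble.
    static List<AnthropicMessage> withPrefill(List<AnthropicMessage> messages) {
        List<AnthropicMessage> prefixed = new ArrayList<>(messages);
        prefixed.add(new AnthropicMessage("assistant", PREFILL));
        return prefixed;
    }

    // Response side: the API does not echo the prefill back, so prepend it
    // to reconstruct the complete JSON text.
    static String completeText(String responseText) {
        return PREFILL + responseText;
    }

    public static void main(String[] args) {
        List<AnthropicMessage> messages = List.of(
                new AnthropicMessage("user", "Return the user profile as JSON."));
        System.out.println(withPrefill(messages));
        System.out.println(completeText("\"name\": \"Ada\"}")); // -> {"name": "Ada"}
    }
}
```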
@ryancriswell sounds good! I would add a property (e.g. assistantMessagePrefix) to AnthropicChatModel so that users can configure exactly what the prefix should be. For AnthropicStreamingChatModel, I guess we could call handler.onNext(assistantMessagePrefix) before starting the real streaming?
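Roughly something like this; the handler interface is simplified here and the field/method names are assumptions, not the library's actual API:

```java
// Sketch of a configurable prefix plus the streaming behavior described above.
class StreamingPrefixSketch {

    interface StreamingResponseHandler {
        void onNext(String token);
    }

    private final String assistantMessagePrefix; // e.g. "{", null to disable

    StreamingPrefixSketch(String assistantMessagePrefix) {
        this.assistantMessagePrefix = assistantMessagePrefix;
    }

    void generate(String userMessage, StreamingResponseHandler handler) {
        if (assistantMessagePrefix != null) {
            // Emit the prefix first, since the API will not echo the
            // prefilled text back in the streamed tokens.
            handler.onNext(assistantMessagePrefix);
        }
        // ... send the request with the trailing assistant message and
        // forward each streamed token to handler.onNext(token)
    }
}
```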
@ryancriswell FYI, there is a proposal in https://github.com/quarkiverse/quarkus-langchain4j/issues/905 to introduce a new annotation for AI Services: @AiMessagePrefix. I think this could be useful for your use case and should work for all model providers that support this feature.
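A hypothetical sketch of how that could look; the annotation is only proposed, so its shape and semantics below are assumptions based on the issue, not a released API:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Assumed shape of the proposed annotation (see issue 905).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface AiMessagePrefix {
    String value();
}

// An AI Service could then opt into prefilling per method:
interface ProfileExtractor {

    @AiMessagePrefix("{")
    String extractProfileJson(String documentText);
}
```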
Is your feature request related to a problem? Please describe.
Anthropic Claude models tend to include a preamble of filler text before returning a JSON object, even when explicitly prompted not to.
Describe the solution you'd like
Anthropic recommends pre-filling the Assistant message with "{" to skip the preamble and go straight to generating the JSON object.
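For reference, this is roughly what the prefill looks like at the Messages API level; the model name and prompt below are placeholders:

```java
// Shape of a prefilled request body per Anthropic's prompting guidance.
String requestBody = """
    {
      "model": "claude-3-5-sonnet-latest",
      "max_tokens": 1024,
      "messages": [
        {"role": "user", "content": "Describe this product as a JSON object."},
        {"role": "assistant", "content": "{"}
      ]
    }
    """;
// Claude continues directly from the "{" (e.g. "\"name\": ...}"), so the
// caller must prepend "{" to obtain the complete JSON object.
```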
Describe alternatives you've considered
Issue 1446 now extracts the JSON block from the response, which improved the functionality, but tokens are still wasted on the preamble; skipping it entirely would avoid that cost.