devoxx / DevoxxGenieIDEAPlugin

DevoxxGenie is a plugin for IntelliJ IDEA that uses local LLMs (Ollama, LMStudio, GPT4All, Llama.cpp and Exo) and cloud-based LLMs to help review, test, and explain your project code.
https://devoxx.com
MIT License

Support the Beta Anthropic prompt-caching / cache_control feature #208

Open stephanj opened 4 weeks ago

stephanj commented 4 weeks ago

When files are added to the "window context", they should be marked with cache_control so that the context is served from the cache on follow-up questions when using Anthropic. As long as Langchain4J doesn't support this, we'll need to extend the LC4J Anthropic base client.

https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
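
For orientation, here is a minimal sketch of the request shape this feature relies on. It assumes, per the linked docs, that "system" is sent as a list of text blocks and that the block holding the attached files carries "cache_control": {"type": "ephemeral"}. The class and method names are hypothetical, and the JSON is assembled by hand with the JDK only to keep the example self-contained; the plugin itself would serialize through LangChain4j.

```java
import java.util.List;

/** Hypothetical helper: builds the "system" array of an Anthropic Messages request. */
final class CachedSystemPrompt {

    /** Escapes the characters that must not appear raw inside a JSON string. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"') {
                sb.append("\\\"");
            } else if (c == '\\') {
                sb.append("\\\\");
            } else if (c == '\n') {
                sb.append("\\n");
            } else if (c == '\r') {
                sb.append("\\r");
            } else if (c == '\t') {
                sb.append("\\t");
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /**
     * First block: short instructions, not cached.
     * Second block: the window-context files, marked as cacheable.
     */
    static String systemJson(String instructions, List<String> contextFiles) {
        String context = String.join("\n\n", contextFiles);
        return "["
                + "{\"type\":\"text\",\"text\":\"" + escape(instructions) + "\"},"
                + "{\"type\":\"text\",\"text\":\"" + escape(context) + "\","
                + "\"cache_control\":{\"type\":\"ephemeral\"}}"
                + "]";
    }

    public static void main(String[] args) {
        System.out.println(systemJson("You are an AI assistant.",
                List.of("class A {}", "class B {}")));
    }
}
```

Putting the files in their own trailing block keeps the cached prefix stable across follow-up questions, as long as the attached files do not change.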

stephanj commented 4 weeks ago

Claude Sonnet 3.5 suggestions:

We'll create extended versions of the Langchain4J classes in the com.devoxx.genie.chatmodel.anthropic package. This approach will allow us to add the new functionality while maintaining compatibility with the existing langchain4j structure.

// 1. Create DevoxxAnthropicCreateMessageRequest.java
package com.devoxx.genie.chatmodel.anthropic;

import dev.langchain4j.model.anthropic.internal.api.AnthropicCreateMessageRequest;
import dev.langchain4j.model.anthropic.internal.api.AnthropicMessage;
import dev.langchain4j.model.anthropic.internal.api.AnthropicTool;
import lombok.Data;
import lombok.EqualsAndHashCode;
import lombok.experimental.SuperBuilder;

import java.util.List;

@Data
@EqualsAndHashCode(callSuper = true)
@SuperBuilder(toBuilder = true)
public class DevoxxAnthropicCreateMessageRequest extends AnthropicCreateMessageRequest {
    private List<DevoxxAnthropicSystemContent> system;

    // Override the getSystem method to return the new system content
    @Override
    public List<DevoxxAnthropicSystemContent> getSystem() {
        return system;
    }

    // Override the setSystem method to accept the new system content
    public void setSystem(List<DevoxxAnthropicSystemContent> system) {
        this.system = system;
    }
}

// 2. Create DevoxxAnthropicSystemContent.java
package com.devoxx.genie.chatmodel.anthropic;

import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.annotation.JsonInclude;
import com.fasterxml.jackson.annotation.JsonProperty;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

import static com.fasterxml.jackson.annotation.JsonInclude.Include.NON_NULL;

@Data
@NoArgsConstructor
@AllArgsConstructor
@Builder
@JsonInclude(NON_NULL)
@JsonIgnoreProperties(ignoreUnknown = true)
public class DevoxxAnthropicSystemContent {
    private String type;
    private String text;

    // The API expects the snake_case key "cache_control", not "cacheControl"
    @JsonProperty("cache_control")
    private DevoxxAnthropicCacheControl cacheControl;
}

// 3. Create DevoxxAnthropicCacheControl.java
package com.devoxx.genie.chatmodel.anthropic;

import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.annotation.JsonInclude;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

import static com.fasterxml.jackson.annotation.JsonInclude.Include.NON_NULL;

@Data
@NoArgsConstructor
@AllArgsConstructor
@Builder
@JsonInclude(NON_NULL)
@JsonIgnoreProperties(ignoreUnknown = true)
public class DevoxxAnthropicCacheControl {
    private String type;
}

// 4. Create DevoxxAnthropicChatModel.java
package com.devoxx.genie.chatmodel.anthropic;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.ChatMessage;
import dev.langchain4j.model.anthropic.AnthropicChatModel;
import dev.langchain4j.model.output.Response;
import dev.langchain4j.model.anthropic.internal.api.AnthropicCreateMessageRequest;
import dev.langchain4j.model.anthropic.internal.api.AnthropicCreateMessageResponse;
import dev.langchain4j.model.anthropic.internal.client.AnthropicClient;

import java.util.List;

public class DevoxxAnthropicChatModel extends AnthropicChatModel {

    private final List<DevoxxAnthropicSystemContent> system;
    private final AnthropicClient client;

    public static class Builder extends AnthropicChatModel.Builder<DevoxxAnthropicChatModel, Builder> {
        private List<DevoxxAnthropicSystemContent> system;

        public Builder system(List<DevoxxAnthropicSystemContent> system) {
            this.system = system;
            return this;
        }

        @Override
        public DevoxxAnthropicChatModel build() {
            return new DevoxxAnthropicChatModel(this);
        }
    }

    public static Builder builder() {
        return new Builder();
    }

    protected DevoxxAnthropicChatModel(Builder builder) {
        super(builder);
        this.system = builder.system;
        this.client = AnthropicClient.builder()
                .apiKey(builder.apiKey)
                .baseUrl(builder.baseUrl)
                .version(builder.version)
                .beta("prompt-caching-2024-07-31")  // Set the beta header for cache control
                .timeout(builder.timeout)
                .maxRetries(builder.maxRetries)
                .logRequests(builder.logRequests)
                .logResponses(builder.logResponses)
                .build();
    }

    @Override
    public Response<AiMessage> generate(List<ChatMessage> messages) {
        DevoxxAnthropicCreateMessageRequest request = createRequest(messages);
        AnthropicCreateMessageResponse response = client.createMessage(request);
        return processResponse(response);
    }

    private DevoxxAnthropicCreateMessageRequest createRequest(List<ChatMessage> messages) {
        return DevoxxAnthropicCreateMessageRequest.builder()
                .model(getModelName())
                .messages(toAnthropicMessages(sanitizeMessages(messages)))
                .system(system)
                .maxTokens(getMaxTokens())
                .stopSequences(getStopSequences())
                .stream(false)
                .temperature(getTemperature())
                .topP(getTopP())
                .topK(getTopK())
                .build();
    }

    // Implement other necessary methods...
}

// 5. Modify AnthropicChatModelFactory.java
package com.devoxx.genie.chatmodel.anthropic;

// ... existing imports ...

public class AnthropicChatModelFactory implements ChatModelFactory {

    @Override
    public ChatLanguageModel createChatModel(@NotNull ChatModel chatModel) {
        return DevoxxAnthropicChatModel.builder()
            .apiKey(getApiKey())
            .modelName(chatModel.getModelName())
            .temperature(chatModel.getTemperature())
            .topP(chatModel.getTopP())
            .maxTokens(chatModel.getMaxTokens())
            .maxRetries(chatModel.getMaxRetries())
            .system(createSystemContent(chatModel))
            .build();
    }

    private List<DevoxxAnthropicSystemContent> createSystemContent(ChatModel chatModel) {
        // This is an example. Adjust according to your needs.
        return Arrays.asList(
            DevoxxAnthropicSystemContent.builder()
                .type("text")
                .text("You are an AI assistant.")
                .build(),
            DevoxxAnthropicSystemContent.builder()
                .type("text")
                .text("This is cacheable content.")
                .cacheControl(DevoxxAnthropicCacheControl.builder().type("ephemeral").build())
                .build()
        );
    }

    // ... rest of the class ...
}

Here's an explanation of the changes:

  1. DevoxxAnthropicCreateMessageRequest: This class extends AnthropicCreateMessageRequest and overrides the system field to use our new DevoxxAnthropicSystemContent type.

  2. DevoxxAnthropicSystemContent: This class represents the structure of a system message with cache control.

  3. DevoxxAnthropicCacheControl: This class represents the cache control options.

  4. DevoxxAnthropicChatModel: This class extends AnthropicChatModel and overrides the necessary methods to use our new request type and system content structure. It also sets the beta header for cache control.

  5. AnthropicChatModelFactory: This class is modified to use our new DevoxxAnthropicChatModel instead of the original AnthropicChatModel.
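
The beta header mentioned in step 4 can be illustrated at the HTTP level. The following is only a sketch using the JDK's java.net.http API, not the LangChain4j client: the class name is hypothetical, and the request is built but never sent. The header names and the "2023-06-01" version value are taken from Anthropic's public API documentation; the beta value is the one used in the code above.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical helper: builds (but does not send) a Messages API request with the caching beta enabled. */
final class PromptCachingHttp {

    static final String BETA_HEADER = "anthropic-beta";
    static final String PROMPT_CACHING_BETA = "prompt-caching-2024-07-31";

    static HttpRequest buildRequest(String apiKey, String jsonBody) {
        return HttpRequest.newBuilder()
                .uri(URI.create("https://api.anthropic.com/v1/messages"))
                .timeout(Duration.ofSeconds(60))
                .header("x-api-key", apiKey)
                .header("anthropic-version", "2023-06-01")
                // Without this header the cache_control markers are not honoured during the beta
                .header(BETA_HEADER, PROMPT_CACHING_BETA)
                .header("content-type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("sk-placeholder", "{}");
        System.out.println(request.headers().map());
    }
}
```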

To use this new feature:

  1. When creating a chat model through AnthropicChatModelFactory, it will now use the DevoxxAnthropicChatModel which supports cache control.
  2. The createSystemContent method in AnthropicChatModelFactory demonstrates how to create system content with cache control. Adjust this method according to your specific needs.

This approach extends the functionality of the langchain4j classes without modifying them directly. The new classes are placed in the com.devoxx.genie.chatmodel.anthropic package as requested.

You may need to make additional adjustments to ensure full compatibility with your existing code and to handle any new fields in the Anthropic API responses related to cache usage.
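
As a rough illustration of the cache-usage fields mentioned above: according to the prompt-caching docs, the response's "usage" object reports cache_creation_input_tokens and cache_read_input_tokens next to input_tokens. The sketch below is a hypothetical value class that pulls those counters out of a JSON fragment with a regular expression, purely to stay dependency-free; a real implementation would map them with the same JSON library the client already uses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical value class for the cache-related counters in a response's "usage" object. */
final class CacheUsage {
    final int inputTokens;
    final int cacheCreationInputTokens;
    final int cacheReadInputTokens;

    CacheUsage(int inputTokens, int cacheCreationInputTokens, int cacheReadInputTokens) {
        this.inputTokens = inputTokens;
        this.cacheCreationInputTokens = cacheCreationInputTokens;
        this.cacheReadInputTokens = cacheReadInputTokens;
    }

    /** Reads a non-negative integer field from a JSON fragment; returns 0 when the field is absent. */
    static int intField(String json, String name) {
        Matcher m = Pattern.compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*(\\d+)").matcher(json);
        return m.find() ? Integer.parseInt(m.group(1)) : 0;
    }

    static CacheUsage fromUsageJson(String usageJson) {
        return new CacheUsage(
                intField(usageJson, "input_tokens"),
                intField(usageJson, "cache_creation_input_tokens"),
                intField(usageJson, "cache_read_input_tokens"));
    }

    /** True when at least part of the prompt was read from the cache. */
    boolean servedFromCache() {
        return cacheReadInputTokens > 0;
    }

    public static void main(String[] args) {
        String usage = "{\"input_tokens\":21,\"cache_creation_input_tokens\":0,"
                + "\"cache_read_input_tokens\":1500,\"output_tokens\":40}";
        CacheUsage u = fromUsageJson(usage);
        System.out.println("served from cache: " + u.servedFromCache());
    }
}
```

Logging these counters per request is a simple way to confirm that follow-up questions actually hit the cache.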


Why not use an enum for the value "ephemeral" in "cache_control": {"type": "ephemeral"}?

You're absolutely right. Using an enum for the "ephemeral" value is a better practice. It provides type safety, makes the code more maintainable, and prevents potential errors from typos. Let's modify our implementation to use an enum for the cache control type.

// 1. Modify DevoxxAnthropicCacheControl.java
package com.devoxx.genie.chatmodel.anthropic;

import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.annotation.JsonInclude;
import com.fasterxml.jackson.annotation.JsonProperty;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

import static com.fasterxml.jackson.annotation.JsonInclude.Include.NON_NULL;

@Data
@NoArgsConstructor
@AllArgsConstructor
@Builder
@JsonInclude(NON_NULL)
@JsonIgnoreProperties(ignoreUnknown = true)
public class DevoxxAnthropicCacheControl {
    private CacheControlType type;

    public enum CacheControlType {
        // Jackson writes enum constants by name by default; the API expects lowercase "ephemeral"
        @JsonProperty("ephemeral")
        EPHEMERAL
    }
}

// 2. Update DevoxxAnthropicChatModel.java (only the relevant part)
package com.devoxx.genie.chatmodel.anthropic;

// ... other imports ...
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.PropertyNamingStrategies;

public class DevoxxAnthropicChatModel extends AnthropicChatModel {

    // ... other fields ...
    private final ObjectMapper objectMapper;

    protected DevoxxAnthropicChatModel(Builder builder) {
        super(builder);
        this.system = builder.system;
        this.client = AnthropicClient.builder()
                .apiKey(builder.apiKey)
                .baseUrl(builder.baseUrl)
                .version(builder.version)
                .beta("prompt-caching-2024-07-31")  // Set the beta header for cache control
                .timeout(builder.timeout)
                .maxRetries(builder.maxRetries)
                .logRequests(builder.logRequests)
                .logResponses(builder.logResponses)
                .build();

        this.objectMapper = new ObjectMapper()
                .setPropertyNamingStrategy(PropertyNamingStrategies.SNAKE_CASE);
    }

    @Override
    public Response<AiMessage> generate(List<ChatMessage> messages) {
        DevoxxAnthropicCreateMessageRequest request = createRequest(messages);
        String jsonRequest = convertRequestToJson(request);
        AnthropicCreateMessageResponse response = client.createMessage(jsonRequest);
        return processResponse(response);
    }

    private String convertRequestToJson(DevoxxAnthropicCreateMessageRequest request) {
        try {
            return objectMapper.writeValueAsString(request);
        } catch (Exception e) {
            throw new RuntimeException("Failed to convert request to JSON", e);
        }
    }

    // ... other methods ...
}

// 3. Update AnthropicChatModelFactory.java (only the relevant part)
package com.devoxx.genie.chatmodel.anthropic;

// ... existing imports ...

public class AnthropicChatModelFactory implements ChatModelFactory {

    // ... other methods ...

    private List<DevoxxAnthropicSystemContent> createSystemContent(ChatModel chatModel) {
        // This is an example. Adjust according to your needs.
        return Arrays.asList(
            DevoxxAnthropicSystemContent.builder()
                .type("text")
                .text("You are an AI assistant.")
                .build(),
            DevoxxAnthropicSystemContent.builder()
                .type("text")
                .text("This is cacheable content.")
                .cacheControl(DevoxxAnthropicCacheControl.builder()
                    .type(DevoxxAnthropicCacheControl.CacheControlType.EPHEMERAL)
                    .build())
                .build()
        );
    }

    // ... rest of the class ...
}

Here's an explanation of the changes:

  1. In DevoxxAnthropicCacheControl, we've replaced the String type with an enum CacheControlType. This enum currently has only one value, EPHEMERAL, but it can be easily extended if more cache control types are added in the future.

  2. In DevoxxAnthropicChatModel, we've added an ObjectMapper configured with snake case naming strategy. We've also modified the generate method to convert our custom request object to JSON before sending it to the Anthropic client. This allows us to use our custom classes while still sending the correct JSON structure to the API.

  3. In AnthropicChatModelFactory, we've updated the createSystemContent method to use the new enum when setting the cache control type.
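
To make the effect of the snake case naming strategy in point 2 concrete, here is a simplified, dependency-free sketch of the camelCase to snake_case mapping. The class is hypothetical and is not Jackson's implementation; Jackson's own strategy may differ in edge cases such as runs of capital letters.

```java
/** Hypothetical, simplified camelCase to snake_case converter (not Jackson's implementation). */
final class SnakeCase {

    static String toSnakeCase(String camel) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < camel.length(); i++) {
            char c = camel.charAt(i);
            if (Character.isUpperCase(c)) {
                // Start a new word before every capital, except at the very beginning
                if (i > 0) {
                    sb.append('_');
                }
                sb.append(Character.toLowerCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (String field : new String[] {"cacheControl", "maxTokens", "stopSequences", "type"}) {
            System.out.println(field + " -> " + toSnakeCase(field));
        }
    }
}
```

With this mapping the Java field cacheControl ends up under the JSON key cache_control, which is the key the API expects.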

These changes provide several benefits:

  1. Type Safety: Using an enum prevents accidental typos or invalid values for the cache control type.
  2. Code Completion: IDEs can now provide code completion for the cache control type, making it easier to use.
  3. Maintainability: If Anthropic adds more cache control types in the future, we can easily add them to the enum.
  4. Clear Intent: The enum makes it immediately clear what values are allowed for the cache control type.

Remember to update any other parts of your code that might be creating or using DevoxxAnthropicCacheControl to use the new enum.

This implementation maintains the extension of langchain4j classes without modifying them directly, while improving type safety and maintainability with the use of an enum for cache control type.
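
One caveat worth spelling out: Jackson serializes enum constants by their Java name by default, so the enum needs an explicit lowercase wire value (for example via @JsonProperty or @JsonValue) to produce "ephemeral". The sketch below shows the same idea without any library; the top-level enum, the wireValue accessor and the fromWire lookup are hypothetical names chosen for illustration.

```java
/** Hypothetical stand-alone variant of the cache control enum that carries its JSON wire value. */
enum CacheControlType {
    EPHEMERAL("ephemeral");

    private final String wireValue;

    CacheControlType(String wireValue) {
        this.wireValue = wireValue;
    }

    /** The exact string to put into "cache_control": {"type": ...}. */
    String wireValue() {
        return wireValue;
    }

    /** Maps a wire value back to the enum constant, rejecting unknown values. */
    static CacheControlType fromWire(String value) {
        for (CacheControlType t : values()) {
            if (t.wireValue.equals(value)) {
                return t;
            }
        }
        throw new IllegalArgumentException("Unknown cache_control type: " + value);
    }
}

final class CacheControlTypeDemo {
    public static void main(String[] args) {
        String json = "{\"type\":\"" + CacheControlType.EPHEMERAL.wireValue() + "\"}";
        System.out.println(json);
    }
}
```

Keeping the wire value next to the constant means a future cache control type only needs one new enum entry.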

stephanj commented 4 weeks ago

There's also a Gemini implementation of context caching, but of course it works in a different way:

https://cloud.google.com/vertex-ai/generative-ai/docs/context-cache/context-cache-create#create-context-cache-sample-drest