quarkiverse / quarkus-langchain4j

Quarkus Langchain4j extension
https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html
Apache License 2.0

Enable prompt deployment in watsonx #559

Closed by andreadimaio 1 month ago

andreadimaio commented 1 month ago

Introduction

The idea of this PR is to introduce the possibility of using the concept of "Prompt Deployment" in quarkus-langchain4j for the watsonx module. This is a draft because I would like to discuss with you what I have in mind for the implementation, and get your help in understanding how it can be improved.

Let me start by giving you some more information about "Prompt Deployment": what it is and how it can be used. The watsonx platform offers a component called watsonx.governance. Specific roles in a team (such as AI Engineer and Data Scientist) can use this platform to improve, test, and compare the same prompt across different LLMs, collecting detailed information about the different runs in order to choose the best version to promote.

Once the work is done, a prompt can be deployed and made available in the watsonx.ai platform through two dedicated endpoints (different from the ones the extension uses today).

Each time you want to use a deployed prompt, the only parameters you need to pass are the deploymentId and the prompt_variables. This means that, in this scenario, an interface annotated with @RegisterAiService does not need @SystemMessage and @UserMessage, because the messages are stored in watsonx.
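
To make the request shape concrete, a call to a deployed prompt looks roughly like this, using the same URL/BODY notation as the examples below (the watsonx.ai base path and query parameters are omitted here):

URL: /deployments/{deploymentId}/text/generation
BODY: { "prompt_variables": { "name": "value" } }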

How to implement

The idea is to introduce two new annotations (for the watsonx module only) called @Deployment and @PromptParameter, sketched below.
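
As a starting point, here is a minimal sketch of what the two annotations could look like (the member names are inferred from the examples that follow; the retention and target details are my assumptions):

import java.lang.annotation.*;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
public @interface Deployment {
  String value(); // the deploymentId to call
  String chatMemory() default ""; // name of the prompt variable receiving the chat memory (see Example 2)
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.PARAMETER)
public @interface PromptParameter {
  String value(); // name of the prompt variable this method parameter maps to
}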

Examples

Example 1:

@RegisterAiService
@Deployment("1") // No memory (It would be great to disable the chat memory using the Supplier<ChatMemoryProvider>)
public interface LLMService {

  public String x(String p); 
  // URL: /deployments/1/text/generation 
  // BODY: { "prompt_variables": { "p": ${p} }}

  //or

  public String x(@PromptParameter("mytest") String p);
  // URL: /deployments/1/text/generation 
  // BODY: { "prompt_variables": { "mytest": ${p} }}

  //or

  public String x(@PromptParameter("mytest") @UserMessage String p, @UserMessage String y);
  // URL: /deployments/1/text/generation 
  // BODY: { "prompt_variables": { "mytest": ${p}, "y": ${y}}}
}
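
For completeness, calling such a service from application code would look the same as for any other AI service (a sketch, using the first variant of x above):

@Inject
LLMService service;

String answer = service.x("hello");
// -> URL: /deployments/1/text/generation
// -> BODY: { "prompt_variables": { "p": "hello" }}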

Example 2:

@RegisterAiService
@Deployment(value="2", chatMemory="chat_memory")
public interface LLMService {

  public String x(String p); 
  // URL: /deployments/2/text/generation 
  // BODY: { "prompt_variables": { "p": ${p}, "chat_memory": [UserMessage, AIMessage, UserMessage...] }}
}
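
To make Example 2 concrete, the rendered request could look like this (the exact serialization of the memory into the chat_memory variable is an open implementation detail, so this is only an illustration):

URL: /deployments/2/text/generation
BODY: { "prompt_variables": { "p": "...", "chat_memory": [ { "role": "user", "content": "..." }, { "role": "assistant", "content": "..." } ] }}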

Example 3:

@RegisteredAiService
@Deployment("1")
@SystemMessage("...") //   <- Exception thrown because annotation is not allowed
public interface LLMService {

  @SystemMessage("...") // <- Exception thrown because annotation is not allowed
  @UserMessage("...")   // <- Exception thrown because annotation is not allowed
  public String x(String p); 
}
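
The "annotation not allowed" errors could be enforced at build time in the extension's processor. A rough sketch using Jandex (the DotName values and the method wiring are hypothetical, not the PR's actual code):

import org.jboss.jandex.*;

private static final DotName DEPLOYMENT =
    DotName.createSimple("io.quarkiverse.langchain4j.watsonx.Deployment"); // hypothetical FQCN
private static final DotName SYSTEM_MESSAGE =
    DotName.createSimple("dev.langchain4j.service.SystemMessage");

static void validateDeployedServices(IndexView index) {
  for (AnnotationInstance deployment : index.getAnnotations(DEPLOYMENT)) {
    // @Deployment is a class-level annotation, so the target is the service interface
    ClassInfo iface = deployment.target().asClass();
    if (iface.hasAnnotation(SYSTEM_MESSAGE)) {
      throw new IllegalStateException("@SystemMessage is not allowed on " + iface.name()
          + " because the prompt is stored in watsonx (@Deployment)");
    }
    // ... analogous checks for @SystemMessage/@UserMessage on methods and parameters
  }
}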

I started working on this implementation and the code is in the draft... let me know what you think, and whether it makes sense to continue with this implementation or I am simply adding complexity that is not needed 😆.

andreadimaio commented 1 month ago

I'm rethinking the @PromptParameter part, and it doesn't work, because I'm trying to map @UserMessage to @PromptParameter and those are two different concepts. Maybe there is a solution that fixes this behavior (a custom implementation of chat memory), but that seems more like a hack that changes what the core module currently does. I will close this PR, because the best way forward is to get something into the langchain4j repository that can help me add this functionality.

geoand commented 1 month ago

> I will close this PR because the best way is to try to have something in the langchain4j repository that can help me add this functionality

Makes sense!