[FEATURE] Support for HttpConnector request and response body transformations through scripts

austintlee commented 1 year ago

Is your feature request related to a problem? Client applications that interact with various HTTP endpoints through remote models and HTTP connectors need to be vendor-aware, because each LLM vendor (e.g. OpenAI, Cohere, and AWS Bedrock) defines its own inputs and outputs for its inference API. As a result, every client application ends up duplicating the same vendor-specific logic.

What solution would you like? At a minimum, it would be good to have a way to map vendor-specific parameters to a common set of parameters that client applications can use.

For cases that fall outside the common set of parameters that work across multiple vendors (those that cover 90% of the use cases we know of today), we can provide support for Mustache or Painless scripts to allow users to customize how inputs are prepared before being sent to LLMs and how outputs are presented back to the calling application. This can be used to tweak prompts and transform responses.
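As a rough illustration of the template idea above, here is a minimal Python sketch. It is not ml-commons code: Python's `string.Template` stands in for Mustache, and `REQUEST_TEMPLATE` and `render_request` are hypothetical names made up for this example. It only shows how a user-supplied template could shape the request body before it is sent to a model.

```python
import json
from string import Template

# Hypothetical user-supplied template. string.Template is only a stand-in
# for Mustache here; the real feature would use its own template syntax.
REQUEST_TEMPLATE = Template('{"prompt": $prompt}')


def render_request(template: Template, question: str) -> dict:
    """Wrap the question in a Human/Assistant prompt and render the body."""
    prompt = f"\n\nHuman: {question}\n\nAssistant:"
    # json.dumps escapes quotes and newlines so the rendered text is valid JSON.
    body = template.substitute(prompt=json.dumps(prompt))
    return json.loads(body)


request = render_request(REQUEST_TEMPLATE, 'What is "k-NN"?')
print(request)
```

Escaping the value before substitution matters: a question containing quotes or newlines would otherwise break the rendered JSON.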

What alternatives have you considered?

Do you have any additional context?

austintlee commented 1 year ago

OpenAI vs Bedrock

OpenAI Input

{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ]
}

Output

{
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "\n\nHello there, how may I assist you today?"
      },
      "finish_reason": "stop"
    }
  ]
}

Bedrock (based on what I see in the blueprint) Input

{
  "prompt": "\\n\\nHuman: this is my question\\n\\nAssistant:"
}

Output

{
  "completion": "this is your answer."
}
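To make the difference concrete, here is a small Python sketch of the vendor-specific logic a client would otherwise have to carry. It uses only the request and response shapes shown above. The function names and the `vendor` strings are hypothetical and chosen for illustration; they are not part of ml-commons or of either vendor's SDK.

```python
from typing import Any, Dict


def to_openai_request(question: str,
                      system: str = "You are a helpful assistant.") -> Dict[str, Any]:
    """Build a chat-style body matching the OpenAI input shown above."""
    return {
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": question},
        ],
    }


def to_bedrock_request(question: str) -> Dict[str, Any]:
    """Build a prompt-style body matching the Bedrock input shown above."""
    return {"prompt": f"\n\nHuman: {question}\n\nAssistant:"}


def extract_answer(vendor: str, response: Dict[str, Any]) -> str:
    """Pull the answer text out of either response shape."""
    if vendor == "openai":
        return response["choices"][0]["message"]["content"]
    if vendor == "bedrock":
        return response["completion"]
    raise ValueError(f"unknown vendor: {vendor}")


openai_out = {"choices": [{"index": 0,
                           "message": {"role": "assistant", "content": "Hi!"},
                           "finish_reason": "stop"}]}
bedrock_out = {"completion": "this is your answer."}
print(extract_answer("openai", openai_out))
print(extract_answer("bedrock", bedrock_out))
```

Without connector-level transformations, each client application would need its own copy of a mapping like this.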

krrishdholakia commented 1 year ago

Hey @austintlee I believe LiteLLM can help here.

consistent i/o

We simplify these LLM API calls by translating from the OpenAI format to provider-specific formats.

Here's our current openai param mapping (missing coverage occurs if the provider just doesn't offer an equivalent param):

We also guarantee a consistent input/output format:

from litellm import completion
import os

## set ENV variables 
os.environ["OPENAI_API_KEY"] = "your-openai-key" 
os.environ["COHERE_API_KEY"] = "your-cohere-key" 

messages = [{ "content": "Hello, how are you?","role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion(model="command-nightly", messages=messages)
print(response)

With a guaranteed consistent output, text responses will always be available at ['choices'][0]['message']['content']
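As a sketch of what that consistency buys you, the helper below sends the same question to several models and reads every answer from the same path. `ask_all` and `fake_completion` are hypothetical names for this example. The `complete` argument is assumed to be a callable with the same `model=` and `messages=` keywords used in the snippet above, so a stub stands in for it here and the example runs without API keys.

```python
from typing import Any, Callable, Dict, List


def ask_all(complete: Callable[..., Any],
            models: List[str],
            question: str) -> Dict[str, str]:
    """Send one question to each model and collect the answer text."""
    messages = [{"content": question, "role": "user"}]
    answers = {}
    for model in models:
        response = complete(model=model, messages=messages)
        # Same lookup for every provider, per the consistent output format.
        answers[model] = response["choices"][0]["message"]["content"]
    return answers


def fake_completion(model: str, messages: List[Dict[str, str]]) -> Dict[str, Any]:
    """Stub that mimics the response shape; it makes no network calls."""
    text = f"{model} says: {messages[-1]['content']}"
    return {"choices": [{"message": {"role": "assistant", "content": text}}]}


answers = ask_all(fake_completion, ["gpt-3.5-turbo", "command-nightly"], "Hello")
print(answers)
```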

bedrock

import os 
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = completion(
    model="anthropic.claude-instant-v1",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
)

https://docs.litellm.ai/docs/providers/bedrock

austintlee commented 1 year ago

@krrishdholakia Thanks for bringing this to our attention. Let's move this discussion over to #1495.

dylan-tong-aws commented 10 months ago

@austintlee, can you provide specific examples of cases where Painless scripts lack the utilities you need to perform request/response data transformations?

What is the model or AI service / API you were trying to integrate with?
What is the specific function that is missing, and can you point to equivalent functions in other languages?

Are there specific tensor transformation functions that you need? For instance, are there specific functions available in these tools that you need in OpenSearch:

austintlee commented 10 months ago

You can find the details here -> #1990

ylwu-amzn commented 10 months ago

@austintlee With this PR https://github.com/opensearch-project/ml-commons/pull/1954, we can support pre- and post-process functions on any type of input data starting in 2.12, not just text docs.

BTW, I replied in https://github.com/opensearch-project/ml-commons/issues/1990#issuecomment-1928167932 with the correct post-process function.

[FEATURE] Support for HttpConnector request and response body transformations through scripts #1475
