Giskard-AI / giskard

🐒 Open-Source Evaluation & Testing for LLMs and ML models
https://docs.giskard.ai
Apache License 2.0

Feature: Amazon Bedrock Integration #1590

Closed: arm-diaz closed this issue 4 months ago

arm-diaz commented 9 months ago

πŸš€ Feature Request

Users are interested in running Giskard with Amazon Bedrock models.

πŸ”ˆ Motivation

Integrating Giskard with Amazon Bedrock would allow users to evaluate Bedrock-hosted LLMs and gain more visibility into their performance.

πŸ›° Question?

I would like to hear more about the technical challenges or limitations that could prevent integrating Amazon Bedrock with Giskard. My goal is to get a better sense of the blocking factors, whether from the Bedrock API, Giskard's design, or simply integration complexity.

mattbit commented 9 months ago

Hi @arm-diaz, you should be able to use Bedrock without problems. Check how to wrap your model on this page. If you are using Bedrock with LangChain, check the "Wrap a LangChain object" tab; otherwise, if you are using the SDK, you can wrap your model as described in "Wrap a stand-alone LLM".

Feel free to reopen the issue if you run into problems or have doubts. Also, don’t hesitate to contact us on the Discord support channel.
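For reference, here is a minimal sketch of the "Wrap a stand-alone LLM" approach using boto3 directly. The "question" column name and the Claude 2 model ID are illustrative assumptions, not something prescribed by Giskard:

import json
import boto3
import pandas as pd
import giskard

# Assumes AWS credentials and region are already configured for boto3
bedrock_runtime = boto3.client("bedrock-runtime")

def model_predict(df: pd.DataFrame) -> list:
    # Call a Claude 2 model on Bedrock for each row of the dataset
    outputs = []
    for question in df["question"]:
        body = json.dumps({
            "prompt": f"\n\nHuman: {question}\n\nAssistant: ",
            "max_tokens_to_sample": 1000,
        })
        response = bedrock_runtime.invoke_model(
            body=body,
            modelId="anthropic.claude-v2",  # assumed model ID, adjust to your deployment
            accept="application/json",
            contentType="application/json",
        )
        data = json.loads(response.get("body").read())
        outputs.append(data["completion"])
    return outputs

giskard_model = giskard.Model(
    model=model_predict,
    model_type="text_generation",
    name="Claude 2 on Amazon Bedrock",
    description="A question-answering assistant running on Amazon Bedrock",
    feature_names=["question"],
)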

mattbit commented 9 months ago

EDIT: I misunderstood the request. The question was about running the LLM-assisted detectors with a Bedrock model in place of GPT-4.

luca-martial commented 4 months ago

This was released in https://github.com/Giskard-AI/giskard/releases/tag/v2.10.0 βœ…

Edit: this brings support for custom LLM clients, so you're able to integrate Bedrock by creating your own client class. Native support is not yet implemented, but this solves the issue you mentioned.

mattbit commented 4 months ago

> This was released in https://github.com/Giskard-AI/giskard/releases/tag/v2.10.0 ✅

Not yet

kevinmessiaen commented 4 months ago

@arm-diaz

We added support for custom LLM clients and integrated Mistral. In order to integrate Bedrock, you need to create a class that implements LLMClient.

Examples can be seen with the Mistral and OpenAI clients.

mattbit commented 4 months ago

I haven't tested at all but something like this should work (for Claude 2):

import json
from typing import Optional, Sequence

# imports based on Giskard's LLM client module layout
from giskard.llm.client import LLMClient
from giskard.llm.client.base import ChatMessage


class ClaudeBedrockClient(LLMClient):
    def __init__(self, bedrock_client, model_id: str = "anthropic.claude-v2"):
        self._client = bedrock_client
        self._model_id = model_id

    def complete(
            self,
            messages: Sequence[ChatMessage],
            temperature: float = 1,
            max_tokens: Optional[int] = None,
            caller_id: Optional[str] = None,
            seed: Optional[int] = None,
            format=None,
    ) -> ChatMessage:
        # Build the Human/Assistant prompt format expected by Claude 2
        prompt = ""
        for msg in messages:
            if msg.role.lower() == "assistant":
                prefix = "\n\nAssistant: "
            else:
                prefix = "\n\nHuman: "

            prompt += prefix + msg.content

        prompt += "\n\nAssistant: "

        # Create the request body
        params = {
            "prompt": prompt,
            "max_tokens_to_sample": max_tokens or 1000,
            "temperature": temperature,
            "top_p": 0.9,
        }
        body = json.dumps(params)

        # Invoke the model and parse the completion
        response = self._client.invoke_model(
            body=body,
            modelId=self._model_id,
            accept="application/json",
            contentType="application/json",
        )
        data = json.loads(response.get("body").read())

        return ChatMessage(role="assistant", content=data["completion"])
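Again untested, but usage would look roughly like this (the Claude 2 model ID is an assumption):

import boto3
import giskard

# Assumes AWS credentials and region are already configured for boto3
bedrock_runtime = boto3.client("bedrock-runtime")
claude_client = ClaudeBedrockClient(bedrock_runtime, model_id="anthropic.claude-v2")
giskard.llm.set_default_client(claude_client)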

celmore25 commented 4 months ago

Played around with this today. Got some good results with Haiku on a test dataset. I started with only the Claude 3 models as well as Titan embeddings. Will send in a PR in a little bit...

Bedrock LLM client

from typing import Optional, Sequence
import json

from ..config import LLMConfigurationError
from ..errors import LLMImportError
from . import LLMClient
from .base import ChatMessage

try:
    import boto3  # noqa: F401
except ImportError as err:
    raise LLMImportError(
        flavor="llm", msg="To use Bedrock models, please install the `boto3` package with `pip install boto3`"
    ) from err

class ClaudeBedrockClient(LLMClient):
    def __init__(self,
                 bedrock_runtime_client,
                 model: str = "anthropic.claude-3-sonnet-20240229-v1:0",
                 anthropic_version: str = "bedrock-2023-05-31"):
        self._client = bedrock_runtime_client
        self.model = model
        self.anthropic_version = anthropic_version

    def complete(
            self,
            messages: Sequence[ChatMessage],
            temperature: float = 1,
            max_tokens: Optional[int] = 1000,
            caller_id: Optional[str] = None,
    ) -> ChatMessage:

        # only supporting claude 3 to start
        if 'claude-3' not in self.model:
            raise LLMConfigurationError(
                f"Only claude-3 models are supported as of now, got {self.model}"
            )

        # extract system prompt from messages
        system_prompt = ""
        if len(messages) > 1:
            if messages[0].role.lower() == "user" and messages[1].role.lower() == "user":
                system_prompt = messages[0].content
                messages = messages[1:]

        # Create the messages format needed for bedrock specifically
        input_msg_prompt = []
        for msg in messages:
            if msg.role.lower() == "assistant":
                input_msg_prompt.append(
                    {
                        "role": "assistant",
                        "content": [
                            {
                                "type": "text",
                                "text": msg.content
                            }
                        ]
                    }
                )
            else:
                input_msg_prompt.append(
                    {
                        "role": "user",
                        "content": [
                            {
                                "type": "text",
                                "text": msg.content
                            }
                        ]
                    }
                )

        # create the json body to send to the API
        body = json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": max_tokens,
            "temperature": temperature,
            "system": system_prompt,
            "messages": input_msg_prompt
        })

        # invoke the model and get the response
        try:
            accept = 'application/json'
            contentType = 'application/json'
            response = self._client.invoke_model(
                body=body, modelId=self.model, accept=accept, contentType=contentType
            )
            completion = json.loads(response.get('body').read())
        except RuntimeError as err:
            raise LLMConfigurationError("Could not get response from Bedrock API") from err

        self.logger.log_call(
            prompt_tokens=completion['usage']['input_tokens'],
            sampled_tokens=completion['usage']['output_tokens'],
            model=self.model,
            client_class=self.__class__.__name__,
            caller_id=caller_id,
        )

        msg = completion['content'][0]['text']
        return ChatMessage(role="assistant", content=msg)

Bedrock embedding model

from typing import Sequence

import numpy as np
import json

from .base import BaseEmbedding

class BedrockEmbedding(BaseEmbedding):
    def __init__(self, client, model: str):
        """
        Parameters
        ----------
        client : Bedrock
            boto3 based Bedrock runtime client instance.
        model : str
            Model name.
        """
        self.model = model
        self.client = client

    def embed(self, texts: Sequence[str]) -> np.ndarray:

        if 'titan' not in self.model:
            raise ValueError(f"Only titan embedding models are supported currently, got {self.model} instead")

        if isinstance(texts, str):
            texts = [texts]

        accept = "application/json"
        contentType = "application/json"
        embeddings = []
        for text in texts:
            body = json.dumps({"inputText": text})
            response = self.client.invoke_model(
                body=body, modelId=self.model, accept=accept, contentType=contentType
            )
            response_body = json.loads(response.get("body").read())
            embedding = response_body.get("embedding")
            embeddings.append(embedding)

        return np.array(embeddings)

End-to-end test

import os
import boto3
import pandas as pd

import giskard
from giskard.llm.client.bedrock import ClaudeBedrockClient
from giskard.llm.embeddings.bedrock import BedrockEmbedding
from giskard.rag import generate_testset, KnowledgeBase
from giskard.rag import QATestset

# setup the bedrock client and embedding model
bedrock_runtime = boto3.client("bedrock-runtime", region_name=os.environ["AWS_DEFAULT_REGION"])
llm_client = ClaudeBedrockClient(bedrock_runtime, model="anthropic.claude-3-haiku-20240307-v1:0")
embedding_model = BedrockEmbedding(bedrock_runtime, model="amazon.titan-embed-text-v1")
giskard.llm.set_default_client(llm_client)

# Load your data and initialize the KnowledgeBase
df = pd.read_csv("test_faqs.csv")
knowledge_base = KnowledgeBase(
    data=df,
    embedding_model=embedding_model,
)

# Generate a testset with 10 questions & answers for each question type (this will take a while)
testset = generate_testset(
    knowledge_base, 
    num_questions=60,
    language='en',  # optional, we'll auto detect if not provided
    agent_description="A customer support chatbot for Amazon SageMaker",  # helps generate better questions
)

# Save the generated testset
testset.save("my_testset.jsonl")

# You can easily load it back
loaded_testset = QATestset.load("my_testset.jsonl")

# Convert it to a pandas dataframe
testset_df = loaded_testset.to_pandas()
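
From there, the generated testset can be used to evaluate an actual RAG agent. A hedged sketch, assuming you provide your own answer function that calls your agent (my_rag_agent is a placeholder, not part of Giskard):

from giskard.rag import evaluate

# Hypothetical wrapper around your own RAG agent: takes a question
# (plus optional conversation history) and returns the answer as a string.
def get_answer_fn(question: str, history=None) -> str:
    return my_rag_agent.answer(question)  # placeholder for your own agent

# The evaluation reuses the Bedrock client set as the default above as the judge LLM.
report = evaluate(get_answer_fn, testset=loaded_testset, knowledge_base=knowledge_base)
report.to_html("rag_eval_report.html")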