Giskard-AI / giskard

🐢 Open-Source Evaluation & Testing for ML models & LLMs
https://docs.giskard.ai
Apache License 2.0

Wrapping custom application endpoint not working #1914

Closed AshishGoelTR closed 4 months ago

AshishGoelTR commented 4 months ago

Issue Type

Bug

Source

source

Giskard Library Version

2.11.0

Giskard Hub Version

N/A

OS Platform and Distribution

No response

Python version

3.11.4

Installed python packages

No response

Current Behaviour?

Testing a custom application endpoint that uses AzureOpenAI in the backend fails when the endpoint is wrapped with the code snippet provided in the giskard docs.

The application endpoint does not require an API key at this time, yet giskard.scan requires an OPENAI_API_KEY to be set. If a dummy key is set as an environment variable, the scan fails with an API connection error.
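For reference, the dummy-key workaround we tried is simply exporting a placeholder key before running the scan:

```python
import os

# Placeholder key so giskard's OpenAI client can initialize;
# the scan then fails with APIConnectionError because the key is not valid.
os.environ["OPENAI_API_KEY"] = "sk-dummy"
```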

Standalone code OR list down the steps to reproduce the issue

import os
import requests
import giskard
from giskard.llm import set_llm_model

def call_my_api():
    # json_data is built elsewhere from the input messages
    return [requests.post(
        'https://api-url.com/v1/chatbot/request',
        json=json_data,
        headers={'Content-type': 'application/json'},
    )]

set_llm_model("gpt4")

# Create a giskard.Model object. Don’t forget to fill the `name` and `description`
giskard_model = giskard.Model(
    call_my_api,  # our prediction function
    model_type="text_generation",
    name="My Generic Assistant",
    description="A generic assistant that kindly answers questions.",
    feature_names=["messages"],
)

scan_results = giskard.scan(giskard_model)
display(scan_results)  # in your notebook

Relevant log output

Error:

---------------------------------------------------------------------------
APIConnectionError                        Traceback (most recent call last)
Cell In[22], line 1
----> 1 scan_results = giskard.scan(giskard_model)
      2 display(scan_results)  # in your notebook

File /opt/homebrew/lib/python3.11/site-packages/giskard/scanner/__init__.py:64, in scan(model, dataset, features, params, only, verbose, raise_exceptions)
     35 """Automatically detects model vulnerabilities.
     36 
     37 See :class:`Scanner` for more details.
   (...)
     61     A scan report object containing the results of the scan.
     62 """
     63 scanner = Scanner(params, only=only)
---> 64 return scanner.analyze(
     65     model, dataset=dataset, features=features, verbose=verbose, raise_exceptions=raise_exceptions
     66 )

File /opt/homebrew/lib/python3.11/site-packages/giskard/scanner/scanner.py:100, in Scanner.analyze(self, model, dataset, features, verbose, raise_exceptions)
     77 """Runs the analysis of a model and dataset, detecting issues.
     78 
     79 Parameters
   (...)
     96     A report object containing the detected issues and other information.
     97 """
...
    988     'HTTP Request: %s %s "%i %s"', request.method, request.url, response.status_code, response.reason_phrase
    989 )
    991 try:

APIConnectionError: Connection error.
kevinmessiaen commented 4 months ago

Hello,

In this case it seems that you only set up the call to https://api-url.com/v1/chatbot/request for the model to be tested. However, the scan itself will be calling OpenAI. The log you're showing does not show exactly where the error comes from, but based on your description it comes from the scan and not from the model. The call to your custom URL looks fine to me.

If you want the scan to call your custom URL as well, you need to provide an LLMClient:

import os
from dataclasses import asdict

from typing import Sequence, Optional

import giskard
from giskard.llm import set_default_client
from giskard.llm.client import LLMClient, ChatMessage
import requests
import pandas as pd

class MyApiClient(LLMClient):
    model = 'gpt-4'  # model name sent to the API; adjust to your deployment
    def complete(
            self,
            messages: Sequence[ChatMessage],
            temperature: float = 1,
            max_tokens: Optional[int] = None,
            caller_id: Optional[str] = None,
            seed: Optional[int] = None,
            format=None,
    ) -> ChatMessage:
        # In here I assume that your API have the same format as OpenAI, adjust based on your needs
        completion = requests.post('https://api-url.com/v1/chatbot/request', json={
            'model': self.model,
            'messages': [asdict(m) for m in messages],
            'temperature': temperature,
            'max_tokens': max_tokens,
            'seed': seed,
            'response_format': format
        }, headers={'Content-type': 'application/json'}).json()

        self.logger.log_call(
            prompt_tokens=completion['usage']['prompt_tokens'],
            sampled_tokens=completion['usage']['completion_tokens'],
            model='gpt-4',
            client_class=self.__class__.__name__,
            caller_id=caller_id,
        )

        # completion is a plain dict (from .json()), so use key access
        msg = completion['choices'][0]['message']

        return ChatMessage(role=msg['role'], content=msg['content'])

my_api_client = MyApiClient()

set_default_client(my_api_client)

def call_my_api(df: pd.DataFrame):
    # complete() expects a sequence of ChatMessage, not a raw string
    return [
        my_api_client.complete([ChatMessage(role='user', content=message)]).content
        for message in df['messages']
    ]

# Create a giskard.Model object. Don’t forget to fill the `name` and `description`
giskard_model = giskard.Model(
    call_my_api,  # our prediction function
    model_type="text_generation",
    name="My Generic Assistant",
    description="A generic assistant that kindly answers questions.",
    feature_names=["messages"],
)

scan_results = giskard.scan(giskard_model)
display(scan_results)  # in your notebook
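Note that `ChatMessage` is a plain dataclass, which is why `asdict` in the snippet above turns each message into the `{'role': ..., 'content': ...}` dict shape that OpenAI-style APIs expect. A self-contained sketch (using a local stand-in for the giskard class):

```python
from dataclasses import dataclass, asdict

@dataclass
class ChatMessage:
    role: str
    content: str

messages = [ChatMessage(role="user", content="Hello")]
payload = [asdict(m) for m in messages]
print(payload)  # [{'role': 'user', 'content': 'Hello'}]
```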
AshishGoelTR commented 4 months ago

Thanks @kevinmessiaen. We figured out the model wrapping issue, and the scan now initiates but fails with the error below:

[screenshot of the scan error]

We are using the Azure OpenAI LLM for the detector.
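For an Azure OpenAI-backed detector, the giskard 2.x docs suggest pointing the scan's LLM client at Azure through environment variables instead of `OPENAI_API_KEY` (exact function names may vary by version; the values below are placeholders for your own resource):

```python
import os
import giskard

# Placeholder values; replace with your Azure OpenAI resource details.
os.environ["AZURE_OPENAI_API_KEY"] = "<your-azure-key>"
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://<your-resource>.openai.azure.com/"

giskard.llm.set_llm_api("azure")
# The model name must be your Azure *deployment* name, not "gpt-4".
giskard.llm.set_llm_model("<your-gpt4-deployment-name>")
```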