Giskard-AI / giskard

🐢 Open-Source Evaluation & Testing for LLMs and ML models
https://docs.giskard.ai
Apache License 2.0
3.73k stars · 235 forks

Implement a Gemini `LLMClient` #1901

Open · kevinmessiaen opened this issue 2 months ago

kevinmessiaen commented 2 months ago

🚀 Feature Request

Create a Gemini client that extends LLMClient.

🔈 Motivation

This will allow running the scan and tests using Gemini models.

🛰 Alternatives

N/A

📎 Additional context

Examples can be seen with the Mistral and OpenAI clients.

marouanetalaa commented 1 month ago

Hello, is this implementation still needed? I would like to contribute to this issue.

kevinmessiaen commented 1 month ago

Hello @marouanetalaa ,

yes, we still need this implementation. It would be appreciated if you could contribute to this issue!

I assigned you to it, let me know if you need help or information.

sudharshanavp commented 6 days ago

Hey, I just tried replicating the changes from the pull request on my system, set my Gemini API key, and tried to run a scan.

I get this error for all the LLM-assisted detectors:

2024-07-02 22:16:07,754 pid:17976 MainThread giskard.scanner.logger INFO     LLMStereotypesDetector: Generating test case requirements
2024-07-02 22:16:07,756 pid:17976 MainThread root         WARNING  Unsupported format 'json', ignoring.
2024-07-02 22:16:07,758 pid:17976 MainThread giskard.scanner.logger ERROR    Detector LLMStereotypesDetector failed with error: GenerationConfig.__init__() got an unexpected keyword argument 'seed' 

I have added these lines of code to set up the client:

import os
import giskard
import google.generativeai as genai
from giskard.llm.client.gemini import GeminiClient

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
giskard.llm.set_default_client(GeminiClient())

Khaliq88 commented 6 days ago

import os
import giskard
import google.generativeai as genai
from giskard.llm.client.gemini import GeminiClient

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
giskard.llm.set_default_client(GeminiClient())

kevinmessiaen commented 6 days ago

> Detector LLMStereotypesDetector failed with error: GenerationConfig.__init__() got an unexpected keyword argument 'seed'

Thanks for reporting the issue, I corrected this in #1975.

sudharshanavp commented 5 days ago

Hey, now I'm running into this issue:

2024-07-03 20:40:01,087 pid:11428 MainThread giskard.scanner.logger INFO     Running detectors: ['LLMBasicSycophancyDetector', 'LLMCharsInjectionDetector', 'LLMHarmfulContentDetector', 'LLMImplausibleOutputDetector', 'LLMInformationDisclosureDetector', 'LLMOutputFormattingDetector', 'LLMPromptInjectionDetector', 'LLMStereotypesDetector', 'LLMFaithfulnessDetector']
Running detector LLMBasicSycophancyDetector…
2024-07-03 20:40:01,283 pid:11428 MainThread root         WARNING  Unsupported format 'json', ignoring.
2024-07-03 20:40:03,589 pid:11428 MainThread giskard.scanner.logger ERROR    Detector LLMBasicSycophancyDetector failed with error: 400 Please use a valid role: user, model.
Traceback (most recent call last):
  File "e:\Work\1. Projects\giskard_demo\.venv\Lib\site-packages\giskard\scanner\scanner.py", line 152, in _run_detectors
    detected_issues = detector.run(model, dataset, features=features)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "e:\Work\1. Projects\giskard_demo\.venv\Lib\site-packages\giskard\scanner\llm\llm_basic_sycophancy_detector.py", line 85, in run
    dataset1, dataset2 = generator.generate_dataset(
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "e:\Work\1. Projects\giskard_demo\.venv\Lib\site-packages\giskard\llm\generators\sycophancy.py", line 101, in generate_dataset
    out = self.llm_client.complete(
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "e:\Work\1. Projects\giskard_demo\.venv\Lib\site-packages\giskard\llm\client\gemini.py", line 75, in complete
    contents=_format(messages),
             ^^^^^^^^^^^^^^^^^^
  File "e:\Work\1. Projects\giskard_demo\.venv\Lib\site-packages\google\generativeai\generative_models.py", line 331, in generate_content
    response = self._client.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "e:\Work\1. Projects\giskard_demo\.venv\Lib\site-packages\google\ai\generativelanguage_v1beta\services\generative_service\client.py", line 827, in generate_content
    response = rpc(
               ^^^^
  File "e:\Work\1. Projects\giskard_demo\.venv\Lib\site-packages\google\api_core\gapic_v1\method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "e:\Work\1. Projects\giskard_demo\.venv\Lib\site-packages\google\api_core\retry\retry_unary.py", line 293, in retry_wrapped_func
    return retry_target(
           ^^^^^^^^^^^^^
  File "e:\Work\1. Projects\giskard_demo\.venv\Lib\site-packages\google\api_core\retry\retry_unary.py", line 153, in retry_target
    _retry_error_helper(
  File "e:\Work\1. Projects\giskard_demo\.venv\Lib\site-packages\google\api_core\retry\retry_base.py", line 212, in _retry_error_helper
    raise final_exc from source_exc
  File "e:\Work\1. Projects\giskard_demo\.venv\Lib\site-packages\google\api_core\retry\retry_unary.py", line 144, in retry_target
    result = target()
             ^^^^^^^^
  File "e:\Work\1. Projects\giskard_demo\.venv\Lib\site-packages\google\api_core\timeout.py", line 120, in func_with_timeout
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "e:\Work\1. Projects\giskard_demo\.venv\Lib\site-packages\google\api_core\grpc_helpers.py", line 78, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InvalidArgument: 400 Please use a valid role: user, model.
LLMBasicSycophancyDetector: 0 issue detected. (Took 0:00:02.711737)

I think Gemini only accepts `user` or `model` as roles.

Gemini also supports system instructions (https://ai.google.dev/gemini-api/docs/system-instructions?lang=python). How would we ideally use those within the `GeminiClient`?
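For illustration, the role restriction above could be handled with a small mapping step before calling `generate_content`. This is a hypothetical sketch, not the merged code: the helper name, the message shape (`{'role': ..., 'content': ...}`), and the OpenAI-style role names are assumptions.

```python
# Hypothetical sketch: translate OpenAI-style roles into the only two
# roles Gemini's chat API accepts, "user" and "model".
GEMINI_ROLE_MAP = {
    "system": "user",      # Gemini chat turns have no "system" role; fold into "user"
    "user": "user",
    "assistant": "model",  # Gemini calls the assistant role "model"
    "model": "model",
}

def to_gemini_contents(messages):
    """Convert [{'role': ..., 'content': ...}] into Gemini's contents format."""
    return [
        {"role": GEMINI_ROLE_MAP.get(m["role"], "user"), "parts": [m["content"]]}
        for m in messages
    ]
```

Mapping unknown roles to "user" is a defensive default; raising on unknown roles would be a valid alternative.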

kevinmessiaen commented 5 days ago

> I think Gemini only accepts user or model as role.
>
> Also https://ai.google.dev/gemini-api/docs/system-instructions?lang=python is a thing. How would we ideally use this within the GeminiClient?

You're right, it would be more robust to switch to `system_instruction`. However, since we ask the user to provide the `GenerativeModel`, we don't have access to it; I'll take a look at how to implement this properly.

I also committed a fix for the role name: the mapping was implemented, but somehow the variable used in the end was the initial one.
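As a rough sketch of the `system_instruction` idea: system messages could be split out of the chat turns and passed separately when the model is constructed. Everything here is illustrative (the helper name, the message shape, and the commented-out model name are assumptions), and as noted above it would require the client to build the `GenerativeModel` itself rather than receive it from the user.

```python
def split_system_messages(messages):
    """Separate system messages from chat turns for Gemini.

    Gemini chat turns only allow the roles "user" and "model", so system
    messages are pulled out and joined into a single system instruction
    string (or None if there are no system messages).
    """
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    contents = [
        {"role": "model" if m["role"] == "assistant" else "user",
         "parts": [m["content"]]}
        for m in messages
        if m["role"] != "system"
    ]
    return "\n".join(system_parts) or None, contents

# The client could then build the model per call, e.g. (illustrative):
# import google.generativeai as genai
# system, contents = split_system_messages(messages)
# model = genai.GenerativeModel("gemini-1.5-flash", system_instruction=system)
# response = model.generate_content(contents)
```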