crewAIInc / crewAI

Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
https://crewai.com
MIT License

[BUG] dalle_tool demands OPENAI_API_KEY setting when using Azure #1422

Open henkbb36org opened 1 week ago

henkbb36org commented 1 week ago

Description

I am trying to use the dalle-tool with an OpenAI deployment on Azure. The CrewAI agents in my script work well and WebsiteSearchTool works well, but dalle-tool demands an OPENAI_API_KEY setting in the environment. Setting it to my Azure OpenAI key does not work; it fails with an invalid API key error. I configured the dalle-tool like so:

    dalle_tool = DallETool(
        config={
            "embedder": {
                "provider": "azure_openai",
                "config": {
                    "api_key": os.environ.get("AZURE_OPENAI_KEY"),
                    "model": "dall-e-3",
                },
            }
        }
    )

The same type of config works well for the WebsiteSearchTool. I cannot find any documentation or examples on how to get this working with Azure. Any help?

Steps to Reproduce

The Agent is configured like so:

    illustrator = Agent(
        role='Illustrator',
        goal='Generate an image based on….',
        verbose=True,
        memory=True,
        backstory=(
            "You are an …."
        ),
        tools=[dalle_tool],
        llm=default_llm,
        allow_delegation=False
    )

When running my test script it fails with the error:

    I encountered an error while trying to use the tool. This was the error: Error code: 401 - {'error': {'code': 'invalid_api_key', 'message': 'Incorrect API key provided: 696be861****4a04. You can find your API key at https://platform.openai.com/account/api-keys.', 'param': None, 'type': 'invalid_request_error'}}.

When I don't set any value in the environment variable OPENAI_API_KEY, it gives an error that the value is not set. The LLM is configured using LiteLLM like so:

    default_llm = LLM(
        model="azure/gpt-4o",
        api_key=os.environ.get("AZURE_OPENAI_KEY"),
        base_url=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    )

The proper values are in the .env file, which is loaded by load_dotenv(). The Agent functions work well, as does the WebsiteSearchTool.

Expected behavior

I expected the DallETool to use the Azure OpenAI implementation, but I cannot get it to do that.

Screenshots/Code snippets

None

Operating System

macOS Sonoma

Python Version

3.11

crewAI Version

0.67.1

crewAI Tools Version

0.12.1

Virtual Environment

Venv

Evidence

I encountered an error while trying to use the tool. This was the error: Error code: 401 - {'error': {'code': 'invalid_api_key', 'message': 'Incorrect API key provided: 696be861****4a04. You can find your API key at https://platform.openai.com/account/api-keys.', 'param': None, 'type': 'invalid_request_error'}}. Tool Dall-E Tool accepts these inputs: Dall-E Tool(image_description: 'string') - Generates images using OpenAI's Dall-E model.
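
One way to check which key actually reached the API is to compare the masked key in the error with the keys held in the environment. The sketch below assumes the mask keeps the first 8 and last 4 characters, which is inferred only from the sample error above; mask_key and key_matches_mask are hypothetical helper names, not part of crewAI or the OpenAI client.

```python
def mask_key(key: str, head: int = 8, tail: int = 4) -> str:
    """Mask a secret as head + '****' + tail (format assumed from the sample error)."""
    if len(key) <= head + tail:
        # Too short to mask meaningfully, so hide it entirely.
        return "****"
    return f"{key[:head]}****{key[-tail:]}"


def key_matches_mask(key: str, masked: str) -> bool:
    """Return True if `key` would be rendered as the masked string from the error."""
    return mask_key(key) == masked
```

If key_matches_mask(os.environ.get("AZURE_OPENAI_KEY", ""), "696be861****4a04") returns True, that would suggest the Azure key was sent to the public OpenAI endpoint named in the error, rather than to the Azure deployment.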

Possible Solution

None

Additional context

None

henkbb36org commented 4 days ago

Fixed by writing a custom_dalle_tool.py:

    import json
    from typing import Type

    import requests
    from pydantic import BaseModel
    from crewai_tools.tools.base_tool import BaseTool
    from openai import AzureOpenAI


    class ImagePromptSchema(BaseModel):
        """Input for Dall-E Tool."""

        image_description: str = "Description of the image to be generated by Dall-E."


    class MyCustomDallETool(BaseTool):
        name: str = "Dall-E Tool"
        description: str = "Generates images using OpenAI's Dall-E model."
        args_schema: Type[BaseModel] = ImagePromptSchema

        model: str = "dall-e-3"
        size: str = "1024x1024"
        quality: str = "standard"
        n: int = 1

        def _run(self, **kwargs) -> str:
            # Replace the placeholder strings with your own Azure values.
            client = AzureOpenAI(
                api_version="YOUR_API_VERSION",
                azure_endpoint="YOUR_AZURE_ENDPOINT",
                api_key="YOUR_API_KEY",
            )

            image_description = kwargs.get("image_description")

            if not image_description:
                return "Image description is required."

            response = client.images.generate(
                model=self.model,  # the name of your DALL-E 3 deployment
                prompt=image_description,
                n=self.n,
            )

            image_data = json.dumps(
                {
                    "image_url": response.data[0].url,
                    "image_description": response.data[0].revised_prompt,
                }
            )

            # Download the generated image and save it locally.
            download = requests.get(response.data[0].url)
            if download.status_code == 200:
                with open("downloaded_image.jpg", "wb") as file:  # Replace with your desired file name
                    file.write(download.content)
                print("Image successfully downloaded and saved as 'downloaded_image.jpg'")
            else:
                print(f"Failed to download image. Status code: {download.status_code}")
            return image_data
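
The download step above always writes to the same file name, so each new image overwrites the previous one. A minimal sketch of one alternative, assuming a name derived from a hash of the image bytes is acceptable; save_image_bytes and the "dalle_" prefix are hypothetical and not part of crewAI:

```python
import hashlib
from pathlib import Path


def save_image_bytes(content: bytes, directory: str = ".", suffix: str = ".jpg") -> Path:
    """Write image bytes to a file named after a hash of the content."""
    # The first 16 hex characters of SHA-256 are plenty to tell images apart here.
    digest = hashlib.sha256(content).hexdigest()[:16]
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / f"dalle_{digest}{suffix}"
    path.write_bytes(content)
    return path
```

Inside _run, the body of the status-code check could call save_image_bytes with the downloaded content and print the returned path. Identical images map to the same file, while different images no longer overwrite each other.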

It can be used in a crewAI script:

    from custom_dalle_tool import MyCustomDallETool

    dalle_tool = MyCustomDallETool(model="dall-e-3", size="1024x1024", quality="standard", n=1)

Use tools=[dalle_tool] in the Agent definition to call it.
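
The placeholder strings in _run (YOUR_API_VERSION, YOUR_AZURE_ENDPOINT, YOUR_API_KEY) could instead be read from the same .env values the LLM already uses. The sketch below is one way to do that, not the tool's own behavior. AZURE_OPENAI_KEY and AZURE_OPENAI_ENDPOINT come from the issue text; AZURE_OPENAI_API_VERSION and the helper name azure_client_kwargs are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional

# Environment variable -> AzureOpenAI keyword argument.
# The first two names appear in the issue; the third is a hypothetical name.
ENV_TO_KWARG = {
    "AZURE_OPENAI_KEY": "api_key",
    "AZURE_OPENAI_ENDPOINT": "azure_endpoint",
    "AZURE_OPENAI_API_VERSION": "api_version",
}


def azure_client_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect client keyword arguments from the environment.

    Raises ValueError naming every variable that is missing or empty.
    """
    env = os.environ if env is None else env
    missing = [name for name in ENV_TO_KWARG if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {kwarg: env[name] for name, kwarg in ENV_TO_KWARG.items()}
```

With this in place, the client in _run could be built as AzureOpenAI(**azure_client_kwargs()) after load_dotenv() has run, which keeps the key out of the source file and fails early with a message that names the missing variables.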