uezo / aiavatarkit

🥰 Building AI-based conversational avatars lightning fast ⚡️💬
Apache License 2.0

Add support for ✨Gemini function calling and image input #52

Closed uezo closed 1 month ago

uezo commented 1 month ago

Function calling

import random

# Define the async function that Gemini can call
async def weather(location: str):
    return {"location": location, "weather": random.choice(["clear", "rain", "cloudy"])}

weather_func = GeminiFunction(
    name="get_current_weather",
    description="Get the current weather in a given location",
    parameters={
        "type": "object",
        "properties": {"location": {"type": "string", "description": "The city name of the location for which to get the weather."}},
    },
    func=weather
)

# Create ChatProcessor with functions
chat_processor_gemini = GeminiProcessor(
    api_key=YOUR_API_KEY,
    functions={"get_current_weather": weather_func}
)
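
To illustrate what registering a function by name implies, here is a minimal standalone sketch of name-based dispatch to an async function. It does not use or describe the aiavatar internals: the `functions` dict and the `dispatch` helper are hypothetical names introduced only for this example, and the arguments dict stands in for whatever the model returns in a function call.

```python
import asyncio
import random


async def weather(location: str):
    # Same shape as the function registered above: returns a random condition.
    return {"location": location, "weather": random.choice(["clear", "rain", "cloudy"])}


# Hypothetical registry keyed by function name (an assumption for illustration,
# not the library's actual data structure).
functions = {"get_current_weather": weather}


async def dispatch(name: str, arguments: dict):
    # Look up the function by the name the model asked for.
    func = functions.get(name)
    if func is None:
        return {"error": f"unknown function: {name}"}
    # Pass the model-provided arguments as keyword arguments.
    return await func(**arguments)


result = asyncio.run(dispatch("get_current_weather", {"location": "Tokyo"}))
missing = asyncio.run(dispatch("get_stock_price", {"symbol": "X"}))
print(result)
```

The dict key and the `name` passed to the function definition should match, since the lookup in a scheme like this depends on the name the model sends back.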

Image input

# Implement `get_image()` to return the image as PNG bytes
import io
import pyautogui
from aiavatar.processors.gemini import GeminiProcessorWithVisionBase

class GeminiProcessorWithVisionScreenShot(GeminiProcessorWithVisionBase):
    async def get_image(self) -> bytes:
        buffered = io.BytesIO()
        image = pyautogui.screenshot(region=(0, 0, 1280, 720))
        image.save(buffered, format="PNG")
        image.save("image_to_gemini.png")   # Save current image for debug
        return buffered.getvalue()

# Create ChatProcessor
chat_processor_gemini = GeminiProcessorWithVisionScreenShot(
    api_key=GOOGLE_API_KEY
)
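
When developing a custom `get_image()`, it can help to check that the returned bytes are really PNG data before sending them to the model. The following is a minimal sketch using only the standard library. `FakeVisionProcessor`, `tiny_png`, and `looks_like_png` are hypothetical names for this example; the fake class only mimics the `async get_image() -> bytes` hook shown above and is not part of aiavatar.

```python
import asyncio
import struct
import zlib

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    # A PNG chunk is: length, type, data, CRC32 over type + data.
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def tiny_png() -> bytes:
    # Build a 1x1, 8-bit RGB image so the example needs no display or Pillow.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)
    raw = b"\x00\xff\x00\x00"  # filter byte 0, then one red pixel
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(raw))
        + _chunk(b"IEND", b"")
    )


class FakeVisionProcessor:
    """Hypothetical stand-in exposing the same async hook as the class above."""

    async def get_image(self) -> bytes:
        return tiny_png()


def looks_like_png(data: bytes) -> bool:
    # Cheap sanity check: only verifies the signature, not the full file.
    return data.startswith(PNG_SIGNATURE)


image_bytes = asyncio.run(FakeVisionProcessor().get_image())
print(looks_like_png(image_bytes))
```

The same `looks_like_png` check could be applied to the bytes returned by a real screenshot-based implementation.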