uezo / aiavatarkit

🥰 Building AI-based conversational avatars lightning fast ⚡️💬
Apache License 2.0

Add support for ChatGPT high-performance vision input👀 #55

Closed uezo closed 1 month ago

uezo commented 1 month ago

Adds a high-performance vision feature that the model can trigger dynamically, so an image is captured only when a response asks for one. To use vision, implement get_image() and configure ChatGPTProcessor:

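import io
import os
import pyautogui
from aiavatar.processors.chatgpt import ChatGPTProcessor

# Assumption: the API key is supplied via an environment variable
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]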

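# Implement get_image: capture a region of the screen and return it as PNG bytes
# (`source` is not used in this example)
async def get_image(source: str = None) -> bytes:
    buffered = io.BytesIO()
    image = pyautogui.screenshot(region=(0, 0, 1280, 720))  # (left, top, width, height)
    image.save(buffered, format="PNG")
    image.save("image_to_chatgpt.png")   # Save the current image for debugging
    return buffered.getvalue()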

# Configure ChatGPTProcessor
chat_processor_gpt = ChatGPTProcessor(
    api_key=OPENAI_API_KEY,
    model="gpt-4o",
    system_message_content="If you need image data to process the user's request, include [vision:screenshot] in your response message.\n\nExample\n[vision:screenshot]Understood. I am checking the image."
)
chat_processor_gpt.use_vision = True
chat_processor_gpt.get_image = get_image
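
The system message above asks the model to emit a [vision:screenshot] tag when it needs an image. As a rough illustration of how such a tag could be detected and turned into a get_image() call, here is a minimal sketch. It is not the library's actual implementation: the regular expression, `extract_vision_source`, `fake_get_image`, and `handle_response` are all hypothetical names introduced for this example, and the assumption that the text after "vision:" is passed as `source` is mine.

```python
import asyncio
import re
from typing import Optional, Tuple

# Hypothetical pattern: matches tags such as [vision:screenshot]
VISION_TAG = re.compile(r"\[vision:(\w+)\]")


def extract_vision_source(response_text: str) -> Tuple[Optional[str], str]:
    """Return (source, text_without_tag); source is None if no tag is present."""
    match = VISION_TAG.search(response_text)
    if match is None:
        return None, response_text
    # Remove only the first tag so the rest of the message can be spoken as-is
    cleaned = VISION_TAG.sub("", response_text, count=1)
    return match.group(1), cleaned


async def fake_get_image(source: str = None) -> bytes:
    # Stand-in for a real get_image(); returns placeholder bytes, not a PNG
    return f"image-from-{source}".encode()


async def handle_response(response_text: str, get_image=fake_get_image):
    """Fetch an image only when the response contains a vision tag."""
    source, cleaned = extract_vision_source(response_text)
    image = await get_image(source) if source else None
    return cleaned, image


if __name__ == "__main__":
    print(asyncio.run(handle_response("[vision:screenshot]Understood. I am checking the image.")))
```

Fetching the image only when a tag appears is what would make the control "dynamic": ordinary turns carry no image payload, which keeps them cheaper and faster.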