phidatahq / phidata

Build AI Agents with memory, knowledge, tools and reasoning. Chat with them using a beautiful Agent UI.
https://docs.phidata.com
Mozilla Public License 2.0

Feature Proposal: Support Passing Images as Base64 in `agent.print_response` [ Official Standard Way ] #1460

Open MANISH007700 opened 1 week ago

MANISH007700 commented 1 week ago

Description

Currently, the method add_images_to_message_content supports adding images to message content through URLs.

The goal is to make it easier to directly pass base64-encoded image strings as input without requiring prior manual validation or transformations. This would streamline workflows for users who already work with base64 images in their pipelines.

Proposed Changes

Update the agent.print_response functionality to explicitly accept and process base64-encoded image strings.

Modify the logic in add_images_to_message_content to prioritize and validate base64 strings more robustly.
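To make the validation idea concrete, here is a rough sketch of what such a check could look like. This is not phidata's actual code: `normalize_image_input` and its `default_mime` parameter are hypothetical names used only for illustration, and the sketch assumes that raw base64 input without a data URI prefix should be treated as JPEG.

```python
import base64
import binascii


def normalize_image_input(image: str, default_mime: str = "image/jpeg") -> str:
    """Return a string usable as an `image_url` value.

    Hypothetical helper, not part of phidata. Accepts an http(s) URL,
    a base64 data URI, or a raw base64 string.
    """
    # Plain URLs pass through unchanged.
    if image.startswith(("http://", "https://")):
        return image

    if image.startswith("data:image/"):
        # Split "data:image/png;base64,<payload>" into header and payload.
        header, sep, payload = image.partition(";base64,")
        if not sep:
            raise ValueError("data URI must use ';base64,' encoding")
        mime = header[len("data:"):]
    else:
        # Assumption: a bare base64 string gets the default MIME type.
        mime, payload = default_mime, image

    try:
        # validate=True rejects characters outside the base64 alphabet.
        decoded = base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("image is not valid base64") from exc
    if not decoded:
        raise ValueError("image payload is empty")

    return f"data:{mime};base64,{payload}"
```

With a helper along these lines, callers could pass a URL, a data URI, or a raw base64 string, and malformed input would fail early with a clear error instead of reaching the model.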

PREVIEW CODE

agent.print_response(
    [
        {"type": "text", "text": "What's in this image, describe in 1 sentence"},
        {
            "type": "image_url",
            "image_url": f"data:image/jpeg;base64,{base64_string}",
        },
    ]
)
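The preview above assumes `base64_string` already exists. For readers starting from a file on disk, a minimal standard-library sketch for producing the data URI might look like the following. The function name `image_file_to_data_uri` is hypothetical, and the MIME type is guessed from the file extension only.

```python
import base64
import mimetypes
from pathlib import Path


def image_file_to_data_uri(path: str) -> str:
    """Read an image file and return it as a base64 data URI (hypothetical helper)."""
    # Guess the MIME type from the extension, e.g. ".jpg" gives "image/jpeg".
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot determine an image MIME type for {path!r}")

    # Encode the raw bytes; b64encode returns bytes, so decode to str.
    base64_string = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{base64_string}"
```

The returned string can then be used as the "image_url" value in the message list shown in the preview.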

Let me know if you want me to pick this up, and I'll get it done.

cc: @manthanguptaa @ashpreetbedi

ysolanky commented 5 days ago

@MANISH007700 that is a great catch! Please feel free to take this up. I look forward to reviewing your PR! Thank you!

MANISH007700 commented 2 days ago

@ysolanky can you check this issue first: Viz Model Hallucination? I'll push the new code after that issue is resolved.

Thanks