microsoft / autogen

A programming framework for agentic AI 🤖
https://microsoft.github.io/autogen/
Creative Commons Attribution 4.0 International

[BUG] UriFormatException when handling long image Data URLs in ImageMessage #4266

Closed by gabrielschmith 4 days ago

gabrielschmith commented 6 days ago

What happened?

I am using AutoGen with the Semantic Kernel extension to create an agent. This agent can handle messages of type ImageMessage, but I am encountering an issue when processing image data. Specifically, I am converting an image received as a Data URL into a BinaryData object and passing it to the ImageMessage. However, this results in an exception.

....

Exception Details:
If the Data URL is too long (exceeding 65,535 characters), a UriFormatException is thrown with the following message:

System.UriFormatException: Invalid URI: The Uri string is too long.  
   at System.Uri.CreateThis(String uri, Boolean dontEscape, UriKind uriKind, UriCreationOptions& creationOptions)  
   at System.Uri..ctor(String uriString)  
   at AutoGen.Core.ImageMessage..ctor(Role role, String url, String from, String mimeType)  
   at Softplan.Ea.Pilot.Application.OpenAi.Chats.OpenAiChatService.BuildChatHistory(List`1 messages)+MoveNext()  
   at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)  
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)  
   at Softplan.Ea.Pilot.Application.OpenAi.Chats.OpenAiChatService.<>c__DisplayClass12_0.<<CreateCompletionsAsync>b__0>d.MoveNext()  

This issue also occurs when I directly pass the Data URL to the ImageMessage, as shown in the following code examples:

foreach (var content in message.Content)  
{  
    yield return new ImageMessage(role, content.ImageUrl.Image);  // BinaryData
}  

foreach (var content in message.Content)  
{  
    yield return new ImageMessage(role, content.ImageUrl.Url);  // string Url
}  

In the SemanticKernelChatMessageContentConnector, the issue arises in this block of code:

private IEnumerable<ChatMessageContent> ProcessMessageForOthers(ImageMessage message)  
{  
    var collectionItems = new ChatMessageContentItemCollection();  
    collectionItems.Add(new ImageContent(new Uri(message.Url ?? message.BuildDataUri())));  
    return [new ChatMessageContent(AuthorRole.User, collectionItems)];  
}  

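The exception occurs because, even when I send BinaryData, message.Url is null, so the connector falls back to message.BuildDataUri() and passes the resulting data URI to new Uri(...), which throws once the string is too long.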
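For illustration, one way to avoid building a Uri at all is to split the Data URL into its MIME type and raw bytes, then hand those to an API that accepts binary data. This is a minimal sketch, not the fix that was shipped; DataUrlParser and Parse are hypothetical names that do not exist in AutoGen or Semantic Kernel, and the sketch only handles base64-encoded Data URLs.

```csharp
using System;

// Hypothetical helper for illustration only; not part of AutoGen or Semantic Kernel.
public static class DataUrlParser
{
    // Splits "data:<mime>;base64,<payload>" into its MIME type and decoded bytes
    // without ever constructing a System.Uri, so the Uri length limit does not apply.
    public static (string MimeType, byte[] Bytes) Parse(string dataUrl)
    {
        const string scheme = "data:";
        const string marker = ";base64,";

        if (!dataUrl.StartsWith(scheme, StringComparison.OrdinalIgnoreCase))
            throw new FormatException("Not a data URL.");

        int markerIndex = dataUrl.IndexOf(marker, StringComparison.OrdinalIgnoreCase);
        if (markerIndex < 0)
            throw new FormatException("Only base64-encoded data URLs are handled in this sketch.");

        // Everything between "data:" and ";base64," is treated as the media type
        // (any extra parameters such as charset are kept as-is).
        string mimeType = dataUrl.Substring(scheme.Length, markerIndex - scheme.Length);
        byte[] bytes = Convert.FromBase64String(dataUrl.Substring(markerIndex + marker.Length));
        return (mimeType, bytes);
    }
}
```

The decoded bytes could then be wrapped with BinaryData.FromBytes(bytes). Whether that alone avoids the exception depends on the connector no longer rebuilding a data URI from the BinaryData.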

What did you expect to happen?

The agent should be able to process long Data URLs without encountering a UriFormatException. Alternatively, if Data URLs are not supported directly, there should be clear guidelines on handling large binary data for images in ImageMessage.

How can we reproduce it (as minimally and precisely as possible)?

  1. Use a long image Data URL (e.g., base64-encoded JPEG) and pass it to ImageMessage either directly or via a BinaryData object.
  2. Observe that the UriFormatException is thrown.
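The steps above can be sketched without AutoGen, since the stack trace shows the exception coming from the System.Uri constructor itself. This is a minimal sketch assuming .NET 8 as in the report; LongDataUrlRepro and ThrowsUriFormatException are hypothetical names, the payload is filler rather than a real image, and other runtime versions may apply a different length limit or none.

```csharp
using System;

// Hypothetical repro helper for illustration only.
public static class LongDataUrlRepro
{
    // Builds a Data URL with a payload of the given length and reports
    // whether constructing a System.Uri from it throws UriFormatException.
    public static bool ThrowsUriFormatException(int payloadLength)
    {
        // 'A' is a valid base64 character; the payload is filler, not image data.
        string dataUrl = "data:image/jpeg;base64," + new string('A', payloadLength);
        try
        {
            _ = new Uri(dataUrl);
            return false;
        }
        catch (UriFormatException)
        {
            return true;
        }
    }
}
```

Calling LongDataUrlRepro.ThrowsUriFormatException(70_000) exercises the case described in the report, while a short payload such as 100 characters should construct the Uri without error.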

Please let me know if additional details or examples are needed to address this issue.

AutoGen version

0.2.1

Which package was this bug in

Core

Model used

gpt-4o-mini

Python version

No response

Operating system

Windows Pro 11, .NET 8

Any additional info you think would be helpful for fixing this bug

There is also an issue in the Semantic Kernel repository that discusses the use and support of Data URIs.

jackgerrits commented 6 days ago

@LittleLittleCloud would you be able to take a look?

LittleLittleCloud commented 5 days ago

@gabrielschmith We just pushed a fix, which is available in the nightly build.

Let us know if it fixes the issue you have, thanks!

gabrielschmith commented 4 days ago

@LittleLittleCloud,

Thank you for the quick fix! I just tested it, and it works perfectly with both BinaryData and Data URL in string format. Great job! 🎉

Best regards,
@gabrielschmith