betalgo / openai

OpenAI .NET sdk - Azure OpenAI, ChatGPT, Whisper, and DALL-E
https://betalgo.github.io/openai/
MIT License

Assistants API - MessageContent.ImageBinaryContent() does not work - API returns invalid url #569

Closed pappde closed 1 month ago

pappde commented 1 month ago

Describe the bug: When I send an image to the Assistants API via CreateMessage, I get an error back from the API. The error suggests that the request is not being serialized correctly.

Your code piece

            prompt = ...
            bytes = ...

            var messageRequest = new MessageCreateRequest
            {
                Role = StaticValues.AssistantsStatics.MessageStatics.Roles.User,
                Content = new([
                    MessageContent.TextContent(prompt),
                    MessageContent.ImageBinaryContent(
                                bytes,
                                ImageStatics.ImageFileTypes.Png,
                                ImageStatics.ImageDetailTypes.High
                            )
                ]),
            };

            var messageResult = await Service.Beta.Messages.CreateMessage(ThreadId, messageRequest, CancelToken);

Result: With MessageContent.ImageBinaryContent(), I get "Invalid 'content[1].image_url.url'. Expected a valid URL, but got a value with an invalid format."

Expected behavior: If I send the exact same content in a ChatCompletionCreateRequest, it works, so I know there isn't anything wrong with the fileId or binary.

Edit 5/25/24

matisidler commented 1 month ago

Same issue here. Could you fix it? @pappde

yt3trees commented 1 month ago

It does not appear to be possible to send Base64-encoded images to the Assistants API. https://community.openai.com/t/how-to-send-base64-images-to-assistant-api/752440

kayhantolga commented 1 month ago

I believe this is the same issue @yt3trees mentioned. Please reopen this if it is not. I checked the implementation and couldn't find any issues.

pappde commented 1 month ago

Here is a test I added to MessagesTestHelper:

public static async Task CreateMessageWithImage(IOpenAIService openAI)
{
    ConsoleExtensions.WriteLine("Create MessageWithImage Testing is starting:", ConsoleColor.Cyan);
    var thread = await openAI.Beta.Threads.ThreadCreate();
    if (!thread.Successful)
    {
        if (thread.Error == null)
        {
            throw new("Unknown Error");
        }

        ConsoleExtensions.WriteLine($"{thread.Error.Code}: {thread.Error.Message}", ConsoleColor.Red);
        return;
    }

    var prompt = "Tell me about this image";
    var filename = "image_edit_mask.png";

    var sampleBytes = await FileExtensions.ReadAllBytesAsync($"SampleData/{filename}");

    CreatedThreadId = thread.Id;
    var result = await openAI.Beta.Messages.CreateMessage(CreatedThreadId, new()
    {
        Role = StaticValues.AssistantsStatics.MessageStatics.Roles.User,
        Content = new([
            MessageContent.TextContent(prompt),
            MessageContent.ImageBinaryContent(
                sampleBytes,
                ImageStatics.ImageFileTypes.Png,
                ImageStatics.ImageDetailTypes.High
            )
        ]),
    });
    if (result.Successful)
    {
        CreatedMessageId = result.Id;
        ConsoleExtensions.WriteLine($"Message Created Successfully with ID: {result.Id}", ConsoleColor.Green);
    }
    else
    {
        ConsoleExtensions.WriteError(result.Error);
    }
}

This test results in the following error message:

Invalid 'content[1].image_url.url'. Expected a valid URL, but got a value with an invalid format.

Is something off in the MessageContentConverter? The serialized JSON for the request looks fine, though. Is it a problem with the API?

PS: the JSON for the request:

{
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": "Tell me about this image"
        },
        {
            "type": "image_url",
            "image_url": {
                "url": "data:image/PNG;base64,iVB...",
                "detail": "high"
            }
        }
    ]
}

pappde commented 1 month ago

I did find and fix a separate issue that affects ImageFileContent(). See PR #574. So at least we have a workaround.

EXPLANATION: With ImageFileContent(), you'll see this error message:

Missing required parameter: 'content[1].image_file.file_id'. You provided 'file_Id', did you mean to provide 'file_id'?

Note the subtle capitalization discrepancy on "Id" vs "id". That is fixed in PR #...
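
To illustrate how this kind of casing mismatch can arise, here is a minimal sketch using System.Text.Json. The `ImageFileBad`, `ImageFileGood`, and `CasingDemo` types are hypothetical and are not the library's actual classes; only the `file_Id` and `file_id` names come from the error message above.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical types for illustration only; not the library's classes.
public class ImageFileBad
{
    // Wrong casing: the error message says the API expects "file_id".
    [JsonPropertyName("file_Id")]
    public string FileId { get; set; } = "";
}

public class ImageFileGood
{
    [JsonPropertyName("file_id")]
    public string FileId { get; set; } = "";
}

public static class CasingDemo
{
    // Serializes each variant so the emitted property names can be compared.
    public static string Bad() =>
        JsonSerializer.Serialize(new ImageFileBad { FileId = "abc123" });

    public static string Good() =>
        JsonSerializer.Serialize(new ImageFileGood { FileId = "abc123" });
}
```

Comparing `CasingDemo.Bad()` with `CasingDemo.Good()` shows that the only difference is the property name, which is easy to miss when reading the serialized request.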

ORIGINAL ISSUE: The original issue is still present with ImageBinaryContent. I do wonder if this is a problem with the API. At a quick glance, both the ChatCompletion and Assistants code appear to encode the image bytes into a data URL the same way. However, when sent as part of ChatCompletionCreateRequest.Messages, it works, but when sent as part of MessageCreateRequest.Content, it doesn't. Again, here is the error message that persists with MessageCreateRequest and ImageBinaryContent:

Invalid 'content[1].image_url.url'. Expected a valid URL, but got a value with an invalid format.
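
For reference, here is a minimal sketch of how a byte array becomes a data URL of the shape shown in the request JSON earlier in this thread. `DataUrlDemo.ToDataUrl` is a hypothetical helper written for illustration; it is not the library's implementation, and it says nothing about which URL formats the API accepts.

```csharp
using System;

public static class DataUrlDemo
{
    // Hypothetical helper: builds "data:image/<type>;base64,<payload>"
    // from raw image bytes and an image type string such as "PNG".
    public static string ToDataUrl(byte[] bytes, string imageType) =>
        $"data:image/{imageType};base64,{Convert.ToBase64String(bytes)}";
}
```

Passing the first bytes of a PNG file produces a payload starting with `iVB`, which matches the truncated `url` value in the request JSON.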

REPRO NOTES:
MessagesTestHelper.CreateMessageWithImage - not working
VisionTestHelper.RunSimpleVisionTestUsingBase64EncodedImage - working

I would reopen this issue, but I don't see any option to do so.