Closed deepinderdeol closed 2 months ago
@deepinderdeol Thank you for reporting this issue. At the moment, the Azure .NET SDK for OpenAI only accepts an image as a Uri object that links to a remotely hosted image. As soon as the base64-encoded format is supported, we will update the Semantic Kernel SDK as well. Thanks again!
@dmytrostruk Can't we use the OpenAI API which already has this implemented? The longer I use SK the more I get the impression that most of the features don't work or are not yet implemented.
@Alerinos There are a couple of ways to use OpenAI functionality: use existing SDKs or implement our own logic to perform requests. Each approach has its advantages and disadvantages. The main advantage of using the Azure .NET SDK is that we re-use a lot of functionality that is already implemented, tested and available rather than implementing our own from scratch. This allows us to focus on Semantic Kernel core functionality.
But if you want to use SK with OpenAI features that are not available yet, it's still possible to implement a custom connector, add all the logic necessary to call the OpenAI API, and inject it into the Kernel instance. Here is an example: https://github.com/microsoft/semantic-kernel/blob/main/dotnet/samples/KernelSyntaxExamples/Example16_CustomLLM.cs
+1 on this. Looks like this is part of the Azure SDK's 1.0 release: https://github.com/Azure/azure-sdk-for-net/blob/Azure.AI.OpenAI_1.0.0-beta.12/sdk/openai/Azure.AI.OpenAI/README.md#chat-with-images-using-gpt-4-vision-preview
The raw image goes in as an image URL, but the URL is a data URL such as data:image/png;base64, ...
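A minimal sketch of how such a data URL could be assembled in C#, assuming the image bytes are already in memory. `DataUrlHelper` and `ToDataUrl` are hypothetical names used for illustration; they are not part of Semantic Kernel or the Azure SDK.

```csharp
using System;

// Hypothetical helper for illustration; not part of Semantic Kernel or Azure.AI.OpenAI.
public static class DataUrlHelper
{
    // Builds a data URL of the form "data:<mime type>;base64,<payload>".
    public static string ToDataUrl(byte[] imageBytes, string mimeType)
    {
        if (imageBytes is null)
        {
            throw new ArgumentNullException(nameof(imageBytes));
        }

        if (string.IsNullOrWhiteSpace(mimeType))
        {
            throw new ArgumentException("A MIME type such as image/png is required.", nameof(mimeType));
        }

        // Convert.ToBase64String produces standard (not URL-safe) base64 with padding.
        return $"data:{mimeType};base64,{Convert.ToBase64String(imageBytes)}";
    }
}
```

A call such as `DataUrlHelper.ToDataUrl(File.ReadAllBytes(path), "image/png")` would then yield a string that can be passed wherever an image URL is expected, provided the receiving SDK accepts data URLs.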
The capability is there, but it only accepts 64 KB at most.
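The thread does not say where the reported 64 KB ceiling comes from. One possible explanation, which is an assumption here and not confirmed above, is a cap on the length of the URL string itself. The sketch below estimates how long a data URL will be for a given image size, so it can be compared against whatever limit applies. `DataUrlSize` and its methods are hypothetical names, and the limit is supplied by the caller rather than hard-coded.

```csharp
using System;

// Hypothetical helper for illustration; the applicable limit is an input, not a known constant.
public static class DataUrlSize
{
    // Base64 encodes every 3 input bytes as 4 output characters, padding the last group.
    public static long Base64Length(long byteCount)
    {
        if (byteCount < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(byteCount));
        }

        return 4 * ((byteCount + 2) / 3);
    }

    // Length of "data:<mime type>;base64," plus the encoded payload.
    public static long DataUrlLength(long byteCount, string mimeType)
    {
        string prefix = $"data:{mimeType};base64,";
        return prefix.Length + Base64Length(byteCount);
    }

    // True when the resulting data URL stays within the caller-supplied character budget.
    public static bool FitsWithin(long byteCount, string mimeType, long maxUrlLength)
        => DataUrlLength(byteCount, mimeType) <= maxUrlLength;
}
```

Because base64 inflates the payload by about a third, `DataUrlSize.FitsWithin(48 * 1024, "image/png", 64 * 1024)` is already false: 48 KB of raw image data encodes to 65,536 characters before the prefix is added.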
Can we get an update on this? This is important functionality, specifically for security reasons. The only workaround seems to be hosting the images somewhere, which should be avoidable when the data is sensitive. Azure-hosted models are ideal for use cases where the data should stay behind the firewall as much as possible, so sending the image to the model directly as data is a necessity.
It seems that Semantic Kernel is ready to support this, but the Azure AI SDK is not.
Supporting base64 images is also very important for test/development scenarios.
I have created a new PR in the Azure SDK for .NET repo that will allow us to finally close this issue: https://github.com/Azure/azure-sdk-for-net/pull/43093
Any update on this? I'm wondering whether it is possible to insert an image into the prompt, as the image has to be in that position.
One of the long-hanging fruits, it has become.
Binary + Mime Type seems to be supported now: https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/openai/Azure.AI.OpenAI/README.md#chat-with-images-using-gpt-4-turbo
I assume that adding a condition on whether the Uri is present, to pick the proper constructor overload, would do the trick? (reference)
const string rawImageUri = "<URI to your image>";
using Stream jpegImageStream = File.OpenRead("<path to a local image file>");
ChatCompletionsOptions chatCompletionsOptions = new()
{
    DeploymentName = "gpt-4-turbo",
    Messages =
    {
        new ChatRequestSystemMessage("You are a helpful assistant that describes images."),
        new ChatRequestUserMessage(
            new ChatMessageTextContentItem("Hi! Please describe these images"),
            new ChatMessageImageContentItem(new Uri(rawImageUri)),
            new ChatMessageImageContentItem(jpegImageStream, "image/jpg", ChatMessageImageDetailLevel.Low)),
    },
};
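The condition suggested above could look roughly like the following. This is only a sketch of the branching logic: `ImageInput`, `ImageSource` and `ImageSourceSelector` are hypothetical stand-ins, not the actual Semantic Kernel `ImageContent` type or the connector code, and a real connector would construct the matching `ChatMessageImageContentItem` in each branch.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for an image content object; the real Semantic Kernel type may differ.
public sealed record ImageInput(Uri? Uri, byte[]? Data, string? MimeType);

public enum ImageSource
{
    RemoteUri,
    InlineBinary,
}

public static class ImageSourceSelector
{
    // Decides which constructor overload from the sample above would apply.
    public static ImageSource Select(ImageInput image)
    {
        if (image is null)
        {
            throw new ArgumentNullException(nameof(image));
        }

        // A URI is present: corresponds to new ChatMessageImageContentItem(new Uri(...)).
        if (image.Uri is not null)
        {
            return ImageSource.RemoteUri;
        }

        // No URI, but bytes and a MIME type: corresponds to the (stream, mimeType, detailLevel) overload.
        if (image.Data is { Length: > 0 } && !string.IsNullOrEmpty(image.MimeType))
        {
            return ImageSource.InlineBinary;
        }

        throw new ArgumentException("Image needs either a Uri or non-empty data with a MIME type.", nameof(image));
    }
}
```

Preferring the URI when both are present is an arbitrary choice in this sketch; a connector might reasonably prefer the inline data instead.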
@RogerBarreto, is this something you're tracking as part of the graduation of the content types (in particular ImageContent)?
It worked with v1.14.1, thanks a lot, to everyone involved!
Working in v1.14.1 for me as well!
This gpt-4-vision sample works with the sample image provided in the sample code: https://github.com/microsoft/semantic-kernel/blob/main/dotnet/samples/KernelSyntaxExamples/Example68_GPTVision.cs
However, using a local image file as ImageContent results in an exception.
I tried following instructions on the OpenAI site for using base64-encoded format, but haven't been successful: https://platform.openai.com/docs/guides/vision
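For reference, a rough sketch of the user message that the base64 approach in that guide amounts to, built here with `System.Text.Json`. The field names (`role`, `content`, `type`, `image_url`, `url`) reflect my understanding of the OpenAI vision guide and should be checked against its current version. `VisionMessageSketch` is a hypothetical name, and this only builds the JSON for one message; it does not send a request or go through Semantic Kernel.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper for illustration; builds only the JSON for a single user message.
public static class VisionMessageSketch
{
    public static string BuildUserMessageJson(string prompt, byte[] imageBytes, string mimeType)
    {
        // The image travels inline as a data URL instead of a link to a hosted file.
        string dataUrl = $"data:{mimeType};base64,{Convert.ToBase64String(imageBytes)}";

        var message = new
        {
            role = "user",
            content = new object[]
            {
                new { type = "text", text = prompt },
                new { type = "image_url", image_url = new { url = dataUrl } },
            },
        };

        // The default encoder may escape characters such as '+', which is still valid JSON.
        return JsonSerializer.Serialize(message);
    }
}
```

Calling `VisionMessageSketch.BuildUserMessageJson("Describe this image", File.ReadAllBytes(path), "image/jpeg")` gives a message that could be placed in the `messages` array of a chat completions request.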