microsoft / Phi-3CookBook

This is a Phi-3 book for getting started with Phi-3. Phi-3 is a family of open-source AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and next size up across a variety of language, reasoning, coding, and math benchmarks.
MIT License

SK + Phi-3-vision-128k-instruct-onnx-cpu => Not getting Results #153

Closed aherrick closed 1 month ago

aherrick commented 3 months ago

I'm using the following local model pulled from HF. Here is my code:

        var modelPath =
            @"C:\models\Phi-3-vision-128k-instruct-onnx-cpu\cpu-int4-rtn-block-32-acc-level-4";

#pragma warning disable SKEXP0070 // Type is for evaluation purposes only and is subject to change or removal in future updates. Suppress this diagnostic to proceed.

        // create kernel
        var kernel = Kernel
            .CreateBuilder()
            .AddOnnxRuntimeGenAIChatCompletion(
                modelId: "microsoft/Phi-3-vision-128k-instruct",
                //modelId: "Phi-3-vision-128k-instruct-onnx-cp",
                modelPath: modelPath
            )
            .Build();

        // create chat
        var chat = kernel.GetRequiredService<IChatCompletionService>();
        var history = new ChatHistory();

        var testImgPath = Path.Combine(Directory.GetCurrentDirectory(), "imgs", "test.png");

        // create chat collection items
        var collectionItems = new ChatMessageContentItemCollection
        {
            new TextContent("What is the image?"),
            new ImageContent(File.ReadAllBytes(testImgPath), "image/png")
        };
        history.AddUserMessage(collectionItems);

        Console.Write($"Phi3: ");
        var result = await chat.GetChatMessageContentsAsync(history);
        Console.WriteLine(result[^1].Content);

        Console.Read();

I keep getting responses like this: (image below)

 Sorry, I cannot answer this question.

The image is not shown. This is a placeholder for a content that cannot be displayed.</s>

Is there something wrong with my code?

(Attached image: test)

leestott commented 3 months ago

@aherrick

The response you're seeing suggests that the image content is not being properly processed or passed to the model. Here are a few potential issues and solutions:

1. File Path Issue

2. File Reading Issue

3. Service Configuration

4. Error Handling

Example Code with Error Handling

Here's an example of how you might add error handling to your code:

try
{
    var testImgPath = Path.Combine(Directory.GetCurrentDirectory(), "imgs", "test.png");

    if (!File.Exists(testImgPath))
    {
        Console.WriteLine("Image file not found.");
        return;
    }

    var collectionItems = new ChatMessageContentItemCollection
    {
        new TextContent("What is the image?"),
        new ImageContent(File.ReadAllBytes(testImgPath), "image/png")
    };
    history.AddUserMessage(collectionItems);

    Console.Write($"Phi3: ");
    var result = await chat.GetChatMessageContentsAsync(history);
    // GetChatMessageContentsAsync returns a list of messages; print the last one.
    Console.WriteLine(result[^1].Content);
}
catch (Exception ex)
{
    Console.WriteLine($"An error occurred: {ex.Message}");
}

Debugging Steps

  1. Log File Paths and Status: Print out the file path and check if the file exists.
  2. Check Image Content: Ensure the image content is correctly read and not null or empty.
  3. Service Logs: Check logs from the AI service to see if there are any errors or warnings related to image processing.
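The first two debugging steps can be sketched as a small standalone check. This is only an illustration: `ImageInputCheck` and `Describe` are hypothetical names, not part of Semantic Kernel, and the signature test assumes the file is a PNG, as in the original code. It verifies the file on disk only; it cannot tell you whether the connector actually forwards the image to the model.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration; not part of Semantic Kernel.
public static class ImageInputCheck
{
    // The fixed 8-byte signature that every PNG file starts with.
    private static readonly byte[] PngSignature =
        { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };

    // Returns a one-line diagnostic for the file at the given path.
    public static string Describe(string path)
    {
        // Resolve relative paths so the log shows exactly what was checked.
        var fullPath = Path.GetFullPath(path);

        if (!File.Exists(fullPath))
        {
            return $"missing: {fullPath}";
        }

        byte[] bytes = File.ReadAllBytes(fullPath);
        if (bytes.Length == 0)
        {
            return $"empty: {fullPath}";
        }

        // Compare the first 8 bytes against the PNG signature.
        bool looksLikePng = bytes.Take(PngSignature.Length).SequenceEqual(PngSignature);

        return $"ok: {fullPath} ({bytes.Length} bytes, PNG signature: {looksLikePng})";
    }
}
```

Calling `Console.WriteLine(ImageInputCheck.Describe(testImgPath));` just before building the `ChatMessageContentItemCollection` prints the resolved path and whether the bytes look like a real PNG. If this reports `ok` with a valid signature and the model still says the image is not shown, the file itself is unlikely to be the cause.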

Your code looks mostly valid, but there are a few things to check and consider:

  1. Namespace and Usings: Ensure you have the necessary namespaces and using directives at the top of your file.
  2. Async Context: Since you're using await, make sure your method is marked as async.
  3. Error Handling: Consider adding error handling to manage potential exceptions.

Here's a revised version with these considerations:

using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;                 // Kernel, TextContent, ImageContent
using Microsoft.SemanticKernel.ChatCompletion;  // IChatCompletionService, ChatHistory

public class Program
{
    public static async Task Main(string[] args)
    {
        var modelPath = @"C:\models\Phi-3-vision-128k-instruct-onnx-cpu\cpu-int4-rtn-block-32-acc-level-4";

        #pragma warning disable SKEXP0070 // Type is for evaluation purposes only and is subject to change or removal in future updates. Suppress this diagnostic to proceed.

        // create kernel
        var kernel = Kernel
            .CreateBuilder()
            .AddOnnxRuntimeGenAIChatCompletion(
                modelId: "microsoft/Phi-3-vision-128k-instruct",
                //modelId: "Phi-3-vision-128k-instruct-onnx-cp",
                modelPath: modelPath
            )
            .Build();

        // create chat
        var chat = kernel.GetRequiredService<IChatCompletionService>();
        var history = new ChatHistory();

        var testImgPath = Path.Combine(Directory.GetCurrentDirectory(), "imgs", "test.png");

        // create chat collection items
        var collectionItems = new ChatMessageContentItemCollection
        {
            new TextContent("What is the image?"),
            new ImageContent(File.ReadAllBytes(testImgPath), "image/png")
        };
        history.AddUserMessage(collectionItems);

        Console.Write($"Phi3: ");
        var result = await chat.GetChatMessageContentsAsync(history);
        // GetChatMessageContentsAsync returns a list of messages; print the last one.
        Console.WriteLine(result[^1].Content);

        Console.Read();
    }
}

kinfey commented 2 months ago

Can you try using ONNX as a service? Maybe you can use the Hugging Face connector in SK.

leestott commented 1 month ago

@aherrick you can see an example of how to use the Hugging Face Connector in Semantic Kernel at https://devblogs.microsoft.com/semantic-kernel/how-to-use-hugging-face-models-with-semantic-kernel/