dotnet-smartcomponents / smartcomponents

Experimental, end-to-end AI features for .NET apps

Missing Method Exception using .AddLocalTextEmbeddingGeneration(); #75

Open davidpuplava opened 1 month ago

davidpuplava commented 1 month ago

I am using an Onnx model directly for chat completion:

var builder = Kernel.CreateBuilder();
builder.AddOnnxRuntimeGenAIChatCompletion("phi3", @"C:\git\Phi-3-mini-4k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32")
    .AddLocalTextEmbeddingGeneration();

But a call to a SemanticTextMemory object's .SaveInformationAsync(...) method gives the following error:

System.MissingMethodException
  HResult=0x80131513
  Message=Method not found: 'System.ValueTuple`3<System.ReadOnlyMemory`1<Int64>,System.ReadOnlyMemory`1<Int64>,System.ReadOnlyMemory`1<Int64>> FastBertTokenizer.BertTokenizer.Encode(System.String, Int32, System.Nullable`1<Int32>)'.
  Source=SmartComponents.LocalEmbeddings
  StackTrace:
   at SmartComponents.LocalEmbeddings.LocalEmbedder.Embed[TEmbedding](String inputText, Nullable`1 outputBuffer, Int32 maximumTokens)
   at SmartComponents.LocalEmbeddings.LocalEmbedder.Embed(String inputText, Int32 maximumTokens)
   at SmartComponents.LocalEmbeddings.SemanticKernel.LocalTextEmbeddingGenerationService.GenerateEmbeddingsAsync(IList`1 data, Kernel kernel, CancellationToken cancellationToken)
   at Microsoft.SemanticKernel.Embeddings.EmbeddingGenerationExtensions.<GenerateEmbeddingAsync>d__0`2.MoveNext()
   at Microsoft.SemanticKernel.Memory.SemanticTextMemory.<SaveInformationAsync>d__3.MoveNext()
   at LocalChat.Helpers.MemoryHelper.<PopulateInterestingFacts>d__0.MoveNext() in C:\git\ai-agent-sk\LocalChat\LocalChat\Helpers\MemoryHelper.cs:line 19
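
For reference, the call in MemoryHelper.PopulateInterestingFacts is just the usual SemanticTextMemory pattern. Here is a minimal sketch of it (the store, collection name, and text are placeholders on my side):

// Sketch of the failing call path; assumes the builder from above plus the
// Microsoft.SemanticKernel.Embeddings and Microsoft.SemanticKernel.Memory namespaces.
var kernel = builder.Build();
var embeddingService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
var memory = new SemanticTextMemory(new VolatileMemoryStore(), embeddingService);

// SaveInformationAsync generates the embedding internally, which is where the
// MissingMethodException from SmartComponents.LocalEmbeddings surfaces.
await memory.SaveInformationAsync(
    collection: "interesting-facts",         // placeholder collection name
    text: "Sample fact to embed and store.", // placeholder text
    id: "fact-1");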

Generally speaking, I followed this blog post (https://techcommunity.microsoft.com/t5/educator-developer-blog/building-intelligent-applications-with-local-rag-in-net-and-phi/ba-p/4175721) to test out Semantic Kernel with local RAG, but I wanted to avoid running a local HTTP server, which is why I used OnnxRuntimeGenAIChatCompletion for Phi-3.

For what it's worth, I downloaded the bge-micro-v2 model directly from Hugging Face and was able to get this to work by using the .AddBertOnnxTextEmbeddingGeneration(...) extension method.

var modelPath = @"C:\git\Phi-3-mini-4k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32";
var textModelPath = @"C:\git\bge-micro-v2\onnx\model.onnx";
var vocabPath = @"C:\git\bge-micro-v2\vocab.txt";

var builder = Kernel.CreateBuilder();
builder.AddOnnxRuntimeGenAIChatCompletion("phi3", modelPath)
    .AddBertOnnxTextEmbeddingGeneration(textModelPath, vocabPath);
    //.AddLocalTextEmbeddingGeneration();  // this is the line that throws the MissingMethodException
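
To double-check the workaround, a quick embedding call like this (just a sketch on my end) comes back with a non-empty vector:

// Quick sanity check (illustrative): resolve the registered embedding service and embed a string.
var kernel = builder.Build();
var embeddings = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
var vector = await embeddings.GenerateEmbeddingAsync("hello world");
Console.WriteLine($"Embedding length: {vector.Length}"); // bge-micro-v2 should produce 384-dimensional vectors
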
joslat commented 4 weeks ago

I am having exactly the same issue, any ideas? I followed this post instead, though both end up with the same error :) https://arafattehsin.com/ai-copilot-offline-phi3-semantic-kernel/

davidpuplava commented 4 weeks ago

That is the other post I used as well! Thanks for finding it (I couldn't for the life of me find it when I opened the issue.)

I don't know what inside SmartComponents is the root cause of the error. I don't think the source code has been published yet (if it ever will be).

But try replacing the following line of code (line 21 in your linked article):

builder.AddLocalTextEmbeddingGeneration();

with the .AddBertOnnxTextEmbeddingGeneration(...) line I have above. You can download the model files for textModelPath from here: https://huggingface.co/TaylorAI/bge-micro-v2

Hit the "..." button next to "Train" and select "Clone repository" for instructions on how to pull the large files with git.

Wherever you clone the files to, use the paths for "model.onnx" and "vocab.txt" and pass those in.
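
In other words, something along these lines (the paths are just wherever you cloned the repo to):

// Illustrative paths; point these at your local clone of bge-micro-v2.
var textModelPath = @"C:\git\bge-micro-v2\onnx\model.onnx";
var vocabPath = @"C:\git\bge-micro-v2\vocab.txt";

builder.AddBertOnnxTextEmbeddingGeneration(textModelPath, vocabPath);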

The only other thing was to add the right NuGet packages to get that AddBertOnnxTextEmbeddingGeneration extension method. I think it's in the Microsoft.ML.OnnxRuntime NuGet package, but I'm not sure. Here are all the packages I referenced to get this working: [screenshot of package references]

Give it a try and let me know if you get it working. Good luck!

joslat commented 4 weeks ago

Whoa! Thanks a ton, this worked!! Here's this for you, @davidpuplava! 🥇