SciSharp / LLamaSharp

A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
https://scisharp.github.io/LLamaSharp
MIT License

How do i use RAG by kernel memory and Semantic kernel Handlebar Planner with llama3 #899

Open Barshan-Mandal opened 2 months ago

Barshan-Mandal commented 2 months ago

Description

I am trying the code below but am getting multiple errors.


using LLama.Common;
using LLamaSharp.KernelMemory;
using Microsoft.KernelMemory;
using Microsoft.KernelMemory.Configuration;
using System.Diagnostics;

namespace ConsoleAppSemanticKernel
{
    internal class Program
    {
        private static async Task Main(string[] args)
        {
            await KernelMemory.Run();
        }
        public class KernelMemory
        {
            public static async Task Run()
            {
                Console.ForegroundColor = ConsoleColor.Yellow;
                Console.WriteLine(
                    """

                This program uses the Microsoft.KernelMemory package to ingest documents
                and answer questions about them in an interactive chat prompt.

                """);

                // Setup the kernel memory with the LLM model
                string modelPath = @"G:\E AI\GPT4All\Meta-Llama-3-8B-Instruct.Q4_0.gguf";
                IKernelMemory memory = CreateMemory(modelPath);

                // Ingest documents (format is automatically detected from the filename)
                string[] filesToIngest = [
                    //@"E:\Gpt4All documents\AI_Russell_Norvig - converted_2.pdf"
                    @"C:\Users\Windows Programming\Desktop\testllama3.txt"
                    //Path.GetFullPath(@"./Assets/sample-SK-Readme.pdf"),
                    //Path.GetFullPath(@"./Assets/sample-KM-Readme.pdf"),
                ];

                for (int i = 0; i < filesToIngest.Length; i++)
                {
                    string path = filesToIngest[i];
                    Stopwatch sw = Stopwatch.StartNew();
                    Console.ForegroundColor = ConsoleColor.Blue;
                    Console.WriteLine($"Importing {i + 1} of {filesToIngest.Length}: {path}");
                    await memory.ImportDocumentAsync(path, steps: Constants.PipelineWithoutSummary);
                    Console.WriteLine($"Completed in {sw.Elapsed}\n");
                }

                // Ask a predefined question
                Console.ForegroundColor = ConsoleColor.Green;
                string question1 = "What formats does KM support";
                Console.WriteLine($"Question: {question1}");
                await AnswerQuestion(memory, question1);

                // Let the user ask additional questions
                while (true)
                {
                    Console.ForegroundColor = ConsoleColor.Green;
                    Console.Write("Question: ");
                    string question = Console.ReadLine()!;
                    if (string.IsNullOrEmpty(question))
                        return;

                    await AnswerQuestion(memory, question);
                }
            }

            private static IKernelMemory CreateMemory(string modelPath)
            {
                InferenceParams infParams = new() { AntiPrompts = ["\n\n"] };

                LLamaSharpConfig lsConfig = new(modelPath) { DefaultInferenceParams = infParams };

                SearchClientConfig searchClientConfig = new()
                {
                    MaxMatchesCount = 1,
                    AnswerTokens = 100,
                };

                TextPartitioningOptions parseOptions = new()
                {
                    MaxTokensPerParagraph = 300,
                    MaxTokensPerLine = 100,
                    OverlappingTokens = 30
                };

                return new KernelMemoryBuilder()
                    .WithLLamaSharpDefaults(lsConfig)
                    .WithSearchClientConfig(searchClientConfig)
                    .With(parseOptions)
                    .Build();
            }

            private static async Task AnswerQuestion(IKernelMemory memory, string question)
            {
                Stopwatch sw = Stopwatch.StartNew();
                Console.ForegroundColor = ConsoleColor.DarkGray;
                Console.WriteLine($"Generating answer...");

                MemoryAnswer answer = await memory.AskAsync(question);
                Console.WriteLine($"Answer generated in {sw.Elapsed}");

                Console.ForegroundColor = ConsoleColor.Gray;
                Console.WriteLine($"Answer: {answer.Result}");
                foreach (var source in answer.RelevantSources)
                {
                    Console.WriteLine($"Source: {source.SourceName}");
                }
                Console.WriteLine();
            }
        }
    }
}

The problem always arises in memory.AskAsync():

System.MissingMethodException
  HResult=0x80131513
  Message=Method not found: 'Double Microsoft.KernelMemory.AI.TextGenerationOptions.get_TopP()'.
  Source=LLamaSharp.KernelMemory
  StackTrace:
   at LLamaSharp.KernelMemory.LlamaSharpTextGenerator.OptionsToParams(TextGenerationOptions options, InferenceParams defaultParams)
   at LLamaSharp.KernelMemory.LlamaSharpTextGenerator.GenerateTextAsync(String prompt, TextGenerationOptions options, CancellationToken cancellationToken)
   at Microsoft.KernelMemory.Search.SearchClient.GenerateAnswer(String question, String facts, IContext context, CancellationToken token)
   at Microsoft.KernelMemory.Search.SearchClient.<AskAsync>d__8.MoveNext()
   at ConsoleAppSemanticKernel.Program.KernelMemory.<AnswerQuestion>d__2.MoveNext() in O:\Windows For Programming\Projects\Visual Studio\Console\ConsoleAppSemanticKernel\Program.cs:line 104
   at ConsoleAppSemanticKernel.Program.KernelMemory.<Run>d__0.MoveNext() in O:\Windows For Programming\Projects\Visual Studio\Console\ConsoleAppSemanticKernel\Program.cs:line 57
   at ConsoleAppSemanticKernel.Program.<Main>d__0.MoveNext() in O:\Windows For Programming\Projects\Visual Studio\Console\ConsoleAppSemanticKernel\Program.cs:line 16

llama_get_logits_ith: invalid logits id 163, reason: no logits

Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
   at LLama.Native.LLamaTokenDataArray.Create(ReadOnlySpan`1 logits, Memory`1 buffer)
   at LLama.Sampling.BaseSamplingPipeline.Sample(SafeLLamaContextHandle ctx, Span`1 logits, ReadOnlySpan`1 lastTokens)
   at LLama.Sampling.ISamplingPipelineExtensions.Sample(ISamplingPipeline pipeline, SafeLLamaContextHandle ctx, Span`1 logits, List`1 lastTokens)
   at LLama.StatelessExecutor.InferAsync(String prompt, IInferenceParams inferenceParams, CancellationToken cancellationToken)+MoveNext()
   at LLama.StatelessExecutor.InferAsync(String prompt, IInferenceParams inferenceParams, CancellationToken cancellationToken)+System.Threading.Tasks.Sources.IValueTaskSource<System.Boolean>.GetResult()
   at Microsoft.KernelMemory.Search.SearchClient.AskAsync(String index, String question, ICollection`1 filters, Double minRelevance, IContext context, CancellationToken cancellationToken)
   at Microsoft.KernelMemory.Search.SearchClient.AskAsync(String index, String question, ICollection`1 filters, Double minRelevance, IContext context, CancellationToken cancellationToken)
   at ConsoleAppSemanticKernel.Program.KernelMemory.AnswerQuestion(IKernelMemory memory, String question) in O:\Windows For Programming\Projects\Visual Studio\Console\ConsoleAppSemanticKernel\Program.cs:line 104
   at ConsoleAppSemanticKernel.Program.KernelMemory.Run() in O:\Windows For Programming\Projects\Visual Studio\Console\ConsoleAppSemanticKernel\Program.cs:line 57
   at ConsoleAppSemanticKernel.Program.Main(String[] args) in O:\Windows For Programming\Projects\Visual Studio\Console\ConsoleAppSemanticKernel\Program.cs:line 16
   at ConsoleAppSemanticKernel.Program.<Main>(String[] args)
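A System.MissingMethodException on a property getter such as TextGenerationOptions.get_TopP() is the classic symptom of a NuGet version mismatch: the installed LLamaSharp.KernelMemory binary was compiled against a different Microsoft.KernelMemory release than the one resolved at run time. As a minimal diagnostic sketch (using only types that already appear in the stack traces above), printing the loaded and referenced assembly versions makes such a mismatch visible:

using System;
using LLamaSharp.KernelMemory;
using Microsoft.KernelMemory.AI;

// Version of the Microsoft.KernelMemory abstractions actually loaded at run time.
Console.WriteLine(typeof(TextGenerationOptions).Assembly.FullName);

// Version of Microsoft.KernelMemory that LLamaSharp.KernelMemory was compiled against.
foreach (var reference in typeof(LlamaSharpTextGenerator).Assembly.GetReferencedAssemblies())
{
    if (reference.Name is { } name && name.StartsWith("Microsoft.KernelMemory"))
        Console.WriteLine(reference.FullName);
}

If the two versions disagree, the usual remedy is to pin the Microsoft.KernelMemory packages in the .csproj to the release that the installed LLamaSharp.KernelMemory was built against (see the LLamaSharp release notes) instead of floating to the latest.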

How do I add the Semantic Kernel Handlebars planner here? How do I add function calling?

philipbawn commented 2 months ago

I am affected too.

Question: does this work?
Generating answer...

llama_new_context_with_model: n_ctx = 2048
llama_new_context_with_model: n_batch = 512
llama_new_context_with_model: n_ubatch = 512
llama_new_context_with_model: flash_attn = 0
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: CPU KV buffer size = 256.00 MiB
llama_new_context_with_model: KV self size = 256.00 MiB, K (f16): 128.00 MiB, V (f16): 128.00 MiB
llama_new_context_with_model: CPU output buffer size = 0.02 MiB
llama_new_context_with_model: CPU compute buffer size = 164.01 MiB
llama_new_context_with_model: graph nodes = 1030
llama_new_context_with_model: graph splits = 1
llama_get_logits_ith: invalid logits id 349, reason: no logits

Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
   at LLama.Native.LLamaTokenDataArray.Create(ReadOnlySpan`1 logits, Memory`1 buffer)
   at LLama.Sampling.BaseSamplingPipeline.Sample(SafeLLamaContextHandle ctx, Span`1 logits, ReadOnlySpan`1 lastTokens)
   at LLama.Sampling.ISamplingPipelineExtensions.Sample(ISamplingPipeline pipeline, SafeLLamaContextHandle ctx, Span`1 logits, List`1 lastTokens)
   at LLama.StatelessExecutor.InferAsync(String prompt, IInferenceParams inferenceParams, CancellationToken cancellationToken)+MoveNext()
   at LLama.StatelessExecutor.InferAsync(String prompt, IInferenceParams inferenceParams, CancellationToken cancellationToken)+System.Threading.Tasks.Sources.IValueTaskSource<System.Boolean>.GetResult()
   at Microsoft.KernelMemory.Search.SearchClient.AskAsync(String index, String question, ICollection`1 filters, Double minRelevance, IContext context, CancellationToken cancellationToken)
   at Microsoft.KernelMemory.Search.SearchClient.AskAsync(String index, String question, ICollection`1 filters, Double minRelevance, IContext context, CancellationToken cancellationToken)
   at LLama.Examples.Examples.LLKernelMemory.AnswerQuestion(IKernelMemory memory, String question)
   at LLama.Examples.Examples.LLKernelMemory.Run()
   at ConsoleApp1.Program.Main(String[] args)
   at ConsoleApp1.Program.<Main>(String[] args)

aropb commented 2 months ago

Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
   at LLama.Native.LLamaTokenDataArray.Create(ReadOnlySpan`1 logits, Memory`1 buffer)
   at LLama.Sampling.BaseSamplingPipeline.Sample(SafeLLamaContextHandle ctx, Span`1 logits, ReadOnlySpan`1 lastTokens)
   at LLama.Sampling.ISamplingPipelineExtensions.Sample(ISamplingPipeline pipeline, SafeLLamaContextHandle ctx, Span`1 logits, List`1 lastTokens)
   at LLama.StatelessExecutor.InferAsync(String prompt, IInferenceParams inferenceParams, CancellationToken cancellationToken)+MoveNext()
   at LLama.StatelessExecutor.InferAsync(String prompt, IInferenceParams inferenceParams, CancellationToken cancellationToken)+System.Threading.Tasks.Sources.IValueTaskSource<System.Boolean>.GetResult()

This error has already been reported: https://github.com/SciSharp/LLamaSharp/issues/891

zsogitbe commented 1 month ago

Unfortunately, there are several issues with your question, starting with the title "How do i use RAG by kernel memory and Semantic kernel Handlebar Planner with llama3". It is clear from your question that you do not yet understand what KernelMemory and SemanticKernel are and how to use them, and without understanding these two libraries it is easy to make mistakes and get the errors you have reported. It is difficult to help you this way.

I would suggest that you first learn a bit more about LLamaSharp, KernelMemory and SemanticKernel, then try the examples in the package, and then write some simple examples yourself. That way you will be able to produce working code.

To explain why the title of your question is wrong: the Handlebars planner is a SemanticKernel feature, but your code only references KernelMemory. SemanticKernel and KernelMemory can work together, but for that you need to provide KernelMemory as a plugin to SemanticKernel; a sketch of that wiring follows below.
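For illustration, here is a minimal, untested sketch of that wiring. It assumes the LLamaSharp.semantic-kernel and Microsoft.KernelMemory.SemanticKernelPlugin packages, plus the prerelease Microsoft.SemanticKernel.Planners.Handlebars package; the type names used (LLamaSharpChatCompletion, MemoryPlugin, HandlebarsPlanner) come from those packages and may differ between versions, so treat this as a starting point rather than working code:

using LLama;
using LLama.Common;
using LLamaSharp.SemanticKernel.ChatCompletion;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.KernelMemory;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Planning.Handlebars;

string modelPath = @"G:\E AI\GPT4All\Meta-Llama-3-8B-Instruct.Q4_0.gguf";

// Load the model once; the same weights can back both SemanticKernel and KernelMemory.
var parameters = new ModelParams(modelPath) { ContextSize = 4096 };
using var weights = LLamaWeights.LoadFromFile(parameters);
var executor = new StatelessExecutor(weights, parameters);

// 1. Build a Semantic Kernel whose chat completion runs on the local model.
var builder = Kernel.CreateBuilder();
builder.Services.AddSingleton<IChatCompletionService>(new LLamaSharpChatCompletion(executor));
Kernel kernel = builder.Build();

// 2. Hand KernelMemory to the kernel as a plugin. MemoryPlugin wraps IKernelMemory
//    and exposes its operations (ask, search, save, ...) as kernel functions;
//    in this setup, "function calling" means functions supplied by plugins like this.
IKernelMemory memory = CreateMemory(modelPath); // the CreateMemory helper from the code above
kernel.ImportPluginFromObject(new MemoryPlugin(memory, waitForIngestionToComplete: true), "memory");

// 3. Let the Handlebars planner compose the registered functions into a plan.
var planner = new HandlebarsPlanner(new HandlebarsPlannerOptions { AllowLoops = true });
var plan = await planner.CreatePlanAsync(kernel, "Using the memory plugin, answer: What formats does KM support?");
Console.WriteLine(await plan.InvokeAsync(kernel));

Note that a local GGUF model served through LLamaSharp has no built-in OpenAI-style tool calling; registering plugins and letting a planner (or a prompt) choose among their functions, as above, is the usual way to get function-calling-like behaviour.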