[BUG]: System.NullReferenceException in ChatSession.ChatAsync upgrading to 0.18

grammophone commented 2 weeks ago

Description

I just upgraded the NuGet packages from 0.16 to 0.18. CPU backend. The simple test console program still compiled but it no longer runs. Inside ChatSession.ChatAsync I get the exception:

System.NullReferenceException: 'Object reference not set to an instance of an object.'

LLama.InteractiveExecutor.InferInternal(LLama.Abstractions.IInferenceParams, LLama.StatefulExecutorBase.InferStateArgs)
LLama.StatefulExecutorBase.InferAsync(string, LLama.Abstractions.IInferenceParams, System.Threading.CancellationToken)
LLama.LLamaTransforms.KeywordTextOutputStreamTransform.TransformAsync(System.Collections.Generic.IAsyncEnumerable<string>)
LLama.LLamaTransforms.KeywordTextOutputStreamTransform.TransformAsync(System.Collections.Generic.IAsyncEnumerable<string>)
LLama.ChatSession.ChatAsyncInternal(string, LLama.Abstractions.IInferenceParams, System.Threading.CancellationToken)
LLama.ChatSession.ChatAsyncInternal(string, LLama.Abstractions.IInferenceParams, System.Threading.CancellationToken)
LLama.ChatSession.ChatAsync(LLama.Common.ChatHistory.Message, bool, LLama.Abstractions.IInferenceParams, System.Threading.CancellationToken)
LLama.ChatSession.ChatAsync(LLama.Common.ChatHistory.Message, bool, LLama.Abstractions.IInferenceParams, System.Threading.CancellationToken)
LLamaSharpConsole.Program.Main(string[]) in Program.cs

Reproduction Steps

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using LLama;
using LLama.Common;
using LLama.Sampling;

namespace LLamaSharpConsole
{
    internal class Program
    {
        static async Task Main(string[] args)
        {
            string modelPath = @"D:\Projects\dotNet\Solutions\LLMPlayground\Models\model.gguf";
            //string modelPath = @"D:\Projects\dotNet\Solutions\LLMPlayground\Models\model.quantized.gguf";

            var parameters = new ModelParams(modelPath)
            {
                ContextSize = 4096,
                GpuLayerCount = 10
            };
            using var model = LLamaWeights.LoadFromFile(parameters);
            using var context = model.CreateContext(parameters);
            var executor = new InteractiveExecutor(context);

            ChatHistory chatHistory = new ChatHistory();

            ChatSession session = new(executor, chatHistory);

            session.WithOutputTransform(new LLamaTransforms.KeywordTextOutputStreamTransform(
                    new string[] { "User:", "Assistant:", "�" },
                    redundancyLength: 8));

            InferenceParams inferenceParams = new InferenceParams()
            {
                AntiPrompts = new List<string> { "User:", "Assistant:" },
                //SamplingPipeline = new Mirostat2SamplingPipeline()
                //{
                //  Tau = 1f,
                //  Eta = 0.2f
                //}
            };

            Console.ForegroundColor = ConsoleColor.Yellow;
            Console.WriteLine("The chat session has started.");
            Console.Write("\r\nReady> ");

            // show the prompt
            Console.ForegroundColor = ConsoleColor.Green;
            string userInput = Console.ReadLine() ?? "";

            while (userInput != "exit")
            {
                Console.ForegroundColor = ConsoleColor.White;

                await foreach (
                        var text
                        in session.ChatAsync(
                                new ChatHistory.Message(AuthorRole.User, userInput + "\nAssistant: "),
                                inferenceParams))
                {
                    Console.Write(text);
                }

                Console.ForegroundColor = ConsoleColor.Yellow;
                Console.Write("\r\nReady> ");

                Console.ForegroundColor = ConsoleColor.Green;
                userInput = Console.ReadLine() ?? "";
            }

            Console.ForegroundColor = ConsoleColor.White;
        }

    } //end of class Program
}

Environment & Configuration

Windows 10
.NET 8.0
LLamaSharp version: 0.18
CUDA version (if you are using cuda backend): CPU backend
CPU & GPU device: x64

Known Workarounds

No response

martindevans commented 2 weeks ago

The null reference exception is caused by InferenceParams.SamplingPipeline being null. It's meant to default to the DefaultSamplingPipeline if you don't set it, so that looks like a bug in the LLamaInteractExecutor.
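
To illustrate the failure mode described here, the following is a minimal sketch using stand-in types. `ISampler`, `DefaultSampler`, `Params`, and `Executor` are hypothetical names invented for this example and are not LLamaSharp's actual classes or source. It only shows how dereferencing an unset nullable property throws, and how a null-coalescing fallback would avoid that, assuming the intended behavior is to fall back to a default.

```csharp
#nullable enable
using System;

// Stand-in types for illustration only; these are NOT LLamaSharp's real classes.
interface ISampler
{
    string Sample();
}

sealed class DefaultSampler : ISampler
{
    public string Sample() => "default";
}

sealed class Params
{
    // Nullable and never given a default, mirroring the situation described above.
    public ISampler? Sampler { get; set; }
}

static class Executor
{
    // Dereferences the property directly; throws NullReferenceException when it is unset.
    public static string InferUnsafe(Params p) => p.Sampler!.Sample();

    // Falls back to a default sampler when the caller left the property unset.
    public static string InferSafe(Params p) => (p.Sampler ?? new DefaultSampler()).Sample();
}

static class Demo
{
    static void Main()
    {
        var p = new Params(); // Sampler deliberately left null

        Console.WriteLine(Executor.InferSafe(p));

        try
        {
            Executor.InferUnsafe(p);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("NullReferenceException");
        }
    }
}
```

Until such a fallback exists in the library, the caller has to supply the pipeline explicitly.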

robertmuehsig commented 2 weeks ago

Adding the SamplingPipeline line below should fix it in 0.18.0:

InferenceParams inferenceParams = new InferenceParams()
{
    SamplingPipeline = new DefaultSamplingPipeline(), // Use default sampling pipeline. <-- not in sample
    MaxTokens = 256, // No more than 256 tokens should appear in answer. Remove it if antiprompt is enough for control.
    AntiPrompts = new List<string> { "User:" } // Stop generation once antiprompts appear.
};

grammophone commented 2 weeks ago

Thank you for your prompt responses, @martindevans and @robertmuehsig! Indeed, specifying the pipeline worked.
