henomis / lingoose

🪿 LinGoose is a Go framework for building awesome AI/LLM applications.
https://lingoose.io

LLM RAG + Custom System Messages produces very mixed output #208

Open iocron opened 1 month ago

iocron commented 1 month ago

Describe the bug The LLM produces no answer, sometimes produces weird answers, and at other times the system message seems to be ignored entirely when combining ollama with RAG + a custom system message. The resulting thread contains two system messages (one coming from the RAG implementation). Maybe the system messages get mixed up somehow (I haven't gone through the lingoose implementation yet to check).

Another observed problem is that using .WithModel() on an embedder for RAG does not work: retrieval then returns only the first document. Removing .WithModel() makes retrieval work as expected.
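In isolation, the difference boils down to the two embedder setups below (a minimal sketch; the model name is only a placeholder, and imports are the same as in the test further down):

// Sketch only: contrasting the two embedder setups. "nomic-embed-text" is just
// a placeholder model name.
func newOllamaIndex(withModel bool) *index.Index {
    embedder := ollamaembedder.New()
    if withModel {
        // with an explicit model set, rag.Retrieve() only ever returns the first document
        embedder = embedder.WithModel("nomic-embed-text")
    }
    // without .WithModel(...), retrieval works as expected
    return index.New(
        jsondb.New().WithPersist("index.json"),
        embedder,
    )
}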

To Reproduce

// go test ./... -v
// Note: OllamaAssistantNew, OllamaAssistantOptions and OllamaAssistantLastMessage are
// thin helper wrappers around lingoose's ollama assistant (a slim repro repository is
// linked in a comment below). Package name and import paths are assumed from the
// lingoose examples.
package lingoosetest

import (
    "context"
    "regexp"
    "testing"

    "github.com/henomis/lingoose/document"
    ollamaembedder "github.com/henomis/lingoose/embedder/ollama"
    "github.com/henomis/lingoose/index"
    "github.com/henomis/lingoose/index/vectordb/jsondb"
    "github.com/henomis/lingoose/rag"
    "github.com/henomis/lingoose/types"
)

func TestOllamaAssistantWithRag(t *testing.T) {
    rag := rag.New(
        index.New(
            jsondb.New().WithPersist("index.json"),
            ollamaembedder.New(), // adding .WithModel(...) here breaks retrieval, see above
        ),
    ).WithChunkSize(1024).WithChunkOverlap(0)

    err := rag.AddDocuments(
        context.Background(),
        document.Document{
            Content: "this is some text about hello world",
            Metadata: types.Meta{
                "author": "Wikipedia",
            },
        },
        document.Document{
            Content: "this is a little side story in paris about a little mermaid",
            Metadata: types.Meta{
                "author": "Wikipedia",
            },
        },
    )
    if err != nil {
        t.Error("Not able to add document to RAG..\n", err)
    }

    ragQueryData, _ := rag.Retrieve(context.Background(), "where is the little mermaid")
    t.Log("ragQueryData: ", ragQueryData)

    userMessage := "Tell me a short joke."
    llmAssistant, _ := OllamaAssistantNew(OllamaAssistantOptions{
        Model:         "llama3",
        UserMessage:   userMessage,
        Rag:           rag,
        SystemMessage: "You are a expert in counting words. Return the number of words.",
    })
    answer, _ := OllamaAssistantLastMessage(llmAssistant)
    answerCheck, err := regexp.MatchString("^[0-9 ]+$", answer)

    t.Log(llmAssistant.Thread())

    if err != nil || !answerCheck {
        t.Error("\nAnswer must be a number!")
    }
}

Expected behavior There should be at least some answer/output generated, but most of the time it is empty when using a RAG together with my own system message. Sometimes it produces partly random output. Here is the Thread I get from it (rarely, it also gave me a somewhat correct answer):

Thread:
        system:
            Type: text
            Text: You are a number counting assistant. Return the sum of words as number.
        system:
            Type: text
            Text: You name is AI assistant, and you are a helpful and polite assistant . Your task is to assist humans with their questions.
        user:
            Type: text
            Text: Use the following pieces of retrieved context to answer the question.

        Question: Tell me a short joke.
        Context:
        this is a little side story in paris about a little mermaid

        assistant:
            Type: text
            Text:


Additional information It would be nice to be able to override/set/clear the system message(s) in general, instead of only being able to add new ones.
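For illustration, the kind of workaround this currently forces is sketched below (it assumes thread.Thread exposes its Messages slice, the Role field and the thread.RoleSystem constant; treat it as a sketch, not tested code):

import "github.com/henomis/lingoose/thread"

// clearSystemMessages drops every system message from a thread so that a single
// custom one can be added afterwards. Hypothetical helper; it relies on the
// Messages field, Role field and RoleSystem constant being exported.
func clearSystemMessages(t *thread.Thread) {
    filtered := make([]*thread.Message, 0, len(t.Messages))
    for _, m := range t.Messages {
        if m.Role != thread.RoleSystem {
            filtered = append(filtered, m)
        }
    }
    t.Messages = filtered
}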

henomis commented 1 month ago

Hi @iocron, it seems you are using a custom assistant implementation. Did you experience the same behaviour with lingoose's own assistant? Could you provide an example so that I can try to replicate it?

iocron commented 3 weeks ago

Hi @henomis, the custom assistant implementation is only a wrapper/helper function for creating a new lingoose ollama assistant faster/easier (via an options type). I've created a slim version of my code to provide a simple reproduction of the issue, hope that helps: https://github.com/iocron/lingoose-issue-208
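For context, a simplified sketch of what such a wrapper boils down to, built on lingoose's stock ollama LLM and assistant (an illustration, not the exact code from the repro repository; the helper that reads back the last message is omitted):

package assistantwrapper // illustrative package name

import (
    "context"

    "github.com/henomis/lingoose/assistant"
    "github.com/henomis/lingoose/llm/ollama"
    "github.com/henomis/lingoose/rag"
    "github.com/henomis/lingoose/thread"
)

// OllamaAssistantOptions bundles the settings passed around in the test above.
type OllamaAssistantOptions struct {
    Model         string
    UserMessage   string
    SystemMessage string
    Rag           *rag.RAG
}

// OllamaAssistantNew builds a lingoose assistant backed by an ollama LLM,
// optionally wiring in a RAG and a custom system message, and runs it once.
func OllamaAssistantNew(opts OllamaAssistantOptions) (*assistant.Assistant, error) {
    th := thread.New()
    if opts.SystemMessage != "" {
        th.AddMessage(thread.NewSystemMessage().AddContent(
            thread.NewTextContent(opts.SystemMessage),
        ))
    }
    th.AddMessage(thread.NewUserMessage().AddContent(
        thread.NewTextContent(opts.UserMessage),
    ))

    a := assistant.New(
        ollama.New().WithModel(opts.Model),
    ).WithThread(th)
    if opts.Rag != nil {
        a = a.WithRAG(opts.Rag)
    }

    // Run fills the thread with the assistant's answer; OllamaAssistantLastMessage
    // (not shown) just reads the last message text back out of a.Thread().
    return a, a.Run(context.Background())
}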

Btw., thanks for creating such a good library (lingoose), I really appreciate the work :)