withcatai / node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.
https://node-llama-cpp.withcat.ai
MIT License

feat: reuse context for different chat session history #125

Closed sooraj007 closed 9 months ago

sooraj007 commented 9 months ago

Issue description

I am creating a question-and-answer bot, and I don't want to add my previous chat messages to the session, because adding them exhausts my context limit.

Expected Behavior

I want the ability to not save anything to the session, or the ability to clear the session's context.

Actual Behavior

After some chat, the context quickly gets exhausted and an error is thrown.

Steps to reproduce

Chat with the bot for a while; the context quickly gets exhausted and an error is thrown.

My Environment

| Dependency | Version |
| --- | --- |
| Operating System | |
| CPU | Intel i9 / Apple M1 |
| Node.js version | x.y.zzz |
| Typescript version | x.y.zzz |
| node-llama-cpp version | x.y.zzz |

Additional Context

No response

Relevant Features Used

Are you willing to resolve this issue by submitting a Pull Request?

Yes, I have the time, and I know how to start.

giladgd commented 9 months ago

@sooraj007 It'll be possible to do this in the next beta version.

giladgd commented 9 months ago

@sooraj007 Until the next beta is ready, you can use this workaround.

The workaround consists of clearing the context sequence and creating a new ChatSession with the messages you want to have in it:

Relevant for the 3.0.0-beta.1 version


```typescript
import {fileURLToPath} from "url";
import path from "path";
import {LlamaModel, LlamaContext, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const model = new LlamaModel({
    modelPath: path.join(__dirname, "models", "dolphin-2.1-mistral-7b.Q4_K_M.gguf")
});
const context = new LlamaContext({
    model,
    contextSize: Math.min(4096, model.trainContextSize)
});
const contextSequence = context.getSequence();
const session = new LlamaChatSession({ contextSequence });

const q1 = "Hi there, how are you?";
console.log("User: " + q1);

const a1 = await session.prompt(q1);
console.log("AI: " + a1);

const q2 = "Summarize what you said";
console.log("User: " + q2);

const a2 = await session.prompt(q2);
console.log("AI: " + a2);

// clear the context sequence
await contextSequence.eraseContextTokenRanges([
    {start: 0, end: contextSequence.nextTokenIndex}
]);

// create a new chat session on the same context sequence
const newSession = new LlamaChatSession({
    contextSequence,
    // restore your conversation history without the last message that you want to discard
    conversationHistory: []
});
```
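If you want to carry over only part of the conversation when recreating the session, you first need to decide which messages still fit in the context window. Here is a minimal, self-contained sketch of that idea: it keeps the newest messages that fit under a token budget. The `ChatMessage` shape, `estimateTokens` heuristic, and `trimHistoryToBudget` helper are all hypothetical illustrations, not part of the node-llama-cpp API; a real implementation would use the model's actual tokenizer to count tokens.

```typescript
// Hypothetical helper: keep only the newest chat messages that fit
// within a token budget, so the restored history never exhausts the context.

type ChatMessage = { role: "user" | "assistant"; text: string };

function estimateTokens(text: string): number {
    // Crude heuristic (~1 token per 4 characters); swap in a real tokenizer.
    return Math.ceil(text.length / 4);
}

function trimHistoryToBudget(history: ChatMessage[], maxTokens: number): ChatMessage[] {
    const kept: ChatMessage[] = [];
    let used = 0;
    // Walk from the newest message backwards, keeping whatever still fits.
    for (let i = history.length - 1; i >= 0; i--) {
        const cost = estimateTokens(history[i].text);
        if (used + cost > maxTokens) break;
        kept.unshift(history[i]);
        used += cost;
    }
    return kept;
}
```

The trimmed array could then be passed as the restored history when creating the new `LlamaChatSession` after erasing the context sequence.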

sooraj007 commented 9 months ago

@giladgd Thanks, it works! Please create a Discord group if you have the time, that would be great. Also, node-llama-cpp is awesome; great work.

giladgd commented 9 months ago

@sooraj007 It's a great idea :) I'll do that soon after I finish developing the next version.