withcatai / node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.
https://withcatai.github.io/node-llama-cpp/
MIT License

feat: token biases #196

giladgd closed this 2 months ago

giladgd commented 2 months ago

Description of change

Fixes #192

How to use TokenBias

Here is an example of how to increase the probability of the word "hello" being generated and prevent the word "day" from being generated:

import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession, TokenBias} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "dolphin-2.1-mistral-7b.Q4_K_M.gguf")
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const q1 = "Hi there, how are you?";
console.log("User: " + q1);

const a1 = await session.prompt(q1, {
    tokenBias: (new TokenBias(model))
        .set("Hello", 1)
        .set("hello", 1)
        .set("Day", "never")
        .set("day", "never")
        .set(model.tokenize("day"), "never") // you can also pass tokenized text to set a bias for specific tokens
});
console.log("AI: " + a1);
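To give an intuition for what the bias values above do, here is a conceptual sketch (not the library's actual internals, and the function names are made up for illustration): a numeric bias is added to a token's logit before sampling, shifting its probability up or down, while `"never"` excludes the token entirely, which can be modeled as a logit of `-Infinity`:

```javascript
// Conceptual sketch of logit biasing; applyTokenBias and softmax are
// illustrative helpers, not part of the node-llama-cpp API.
function applyTokenBias(logits, biases) {
    // logits: Map of token -> raw logit; biases: Map of token -> number | "never"
    const adjusted = new Map(logits);
    for (const [token, bias] of biases) {
        if (bias === "never")
            adjusted.set(token, -Infinity); // excluded from sampling entirely
        else
            adjusted.set(token, (adjusted.get(token) ?? 0) + bias);
    }
    return adjusted;
}

function softmax(logitsMap) {
    // Convert logits to probabilities (numerically stable via max subtraction)
    const entries = [...logitsMap.entries()];
    const max = Math.max(...entries.map(([, v]) => v));
    const exps = entries.map(([t, v]) => [t, Math.exp(v - max)]);
    const sum = exps.reduce((acc, [, e]) => acc + e, 0);
    return new Map(exps.map(([t, e]) => [t, e / sum]));
}

// Toy vocabulary: boost "hello" by 1, forbid "day"
const logits = new Map([["hello", 1.0], ["day", 1.5], ["there", 0.5]]);
const biases = new Map([["hello", 1], ["day", "never"]]);
const probs = softmax(applyTokenBias(logits, biases));
```

In this toy example, "day" ends up with probability 0 even though it had the highest raw logit, and "hello" becomes more likely than "there".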

Pull-Request Checklist

github-actions[bot] commented 2 months ago

:tada: This PR is included in version 3.0.0-beta.16 :tada:

The release is available on:

Your semantic-release bot :package::rocket: