withcatai / node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level
https://node-llama-cpp.withcat.ai
MIT License

feat: manual binding loading #153

Closed giladgd closed 8 months ago

giladgd commented 8 months ago

Description of change

Fixes #106. May help with #135.

How to use node-llama-cpp after this change

import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaModel, LlamaContext, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = new LlamaModel({
    llama,
    modelPath: path.join(__dirname, "models", "dolphin-2.1-mistral-7b.Q4_K_M.gguf")
});
const context = new LlamaContext({
    model,
    contextSize: Math.min(4096, model.trainContextSize)
});
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const q1 = "Hi there, how are you?";
console.log("User: " + q1);

const a1 = await session.prompt(q1);
console.log("AI: " + a1);

const q2 = "Summerize what you said";
console.log("User: " + q2);

const a2 = await session.prompt(q2);
console.log("AI: " + a2);

You can pass parameters to getLlama to use another binary or customize binding options:

import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaModel, LlamaLogLevel} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama({
    logLevel: LlamaLogLevel.warn, // disable info and debug logs from llama.cpp
    cuda: true // use a binary with CUDA enabled
});
const model = new LlamaModel({
    llama,
    modelPath: path.join(__dirname, "models", "dolphin-2.1-mistral-7b.Q4_K_M.gguf")
});
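
To check which compute backend the loaded binding actually uses, something like the following should work, assuming the llama.gpu property from the 3.0.0 API:

// false means a CPU-only binding; otherwise e.g. "cuda" or "metal"
console.log("GPU backend:", llama.gpu);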


github-actions[bot] commented 8 months ago

:tada: This PR is included in version 3.0.0-beta.6 :tada:

The release is available on:

Your semantic-release bot :package::rocket:

github-actions[bot] commented 1 week ago

:tada: This PR is included in version 3.0.0 :tada:

The release is available on:

Your semantic-release bot :package::rocket: