withcatai / node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level
https://withcatai.github.io/node-llama-cpp/
MIT License
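The tagline above mentions enforcing a JSON schema on the model output at the generation level. A minimal sketch of what that can look like with the 3.0 beta API, reusing the model path from the examples below; the `createGrammarForJsonSchema` call and the schema shape here are illustrative assumptions, not part of this PR:

import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "dolphin-2.1-mistral-7b.Q4_K_M.gguf")
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

// Build a grammar from a JSON schema; generation is then constrained to match it
const grammar = await llama.createGrammarForJsonSchema({
    type: "object",
    properties: {
        positiveWords: {type: "array", items: {type: "string"}},
        positivityScoreFromOneToTen: {type: "number"}
    }
} as const);

const res = await session.prompt("It's great!", {grammar});
const parsed = grammar.parse(res); // parsed according to the schema

console.log(parsed);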

feat: async operations #178

Closed · giladgd closed this 4 months ago

giladgd commented 4 months ago

Description of change

How to use node-llama-cpp after this change

Regular context
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();

// Load the model; onLoadProgress reports progress as a fraction between 0 and 1
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "dolphin-2.1-mistral-7b.Q4_K_M.gguf"),
    onLoadProgress(loadProgress: number) {
        console.log(`Load progress: ${loadProgress * 100}%`);
    }
});

// Use up to 4096 tokens of context, but no more than the model was trained with
const context = await model.createContext({
    contextSize: Math.min(4096, model.trainContextSize)
});
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const q1 = "Hi there, how are you?";
console.log("User: " + q1);

const a1 = await session.prompt(q1);
console.log("AI: " + a1);

const q2 = "Summerize what you said";
console.log("User: " + q2);

const a2 = await session.prompt(q2);
console.log("AI: " + a2);

Embedding

import {fileURLToPath} from "url";
import path from "path";
import {getLlama} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "functionary-small-v2.2.q4_0.gguf")
});

// A dedicated context type for generating embeddings
const embeddingContext = await model.createEmbeddingContext({
    contextSize: Math.min(4096, model.trainContextSize)
});

const text = "Hello world";
const embedding = await embeddingContext.getEmbeddingFor(text);

console.log(text, embedding.vector);
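Embeddings are usually compared rather than printed, typically by cosine similarity. A short sketch building on the example above; the similarity helper is not part of the library, and `embedding.vector` is assumed to be a plain array of numbers:

// Cosine similarity between two embedding vectors:
// dot(a, b) / (|a| * |b|), ranging from -1 (opposite) to 1 (identical direction)
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const helloEmbedding = await embeddingContext.getEmbeddingFor("Hello world");
const byeEmbedding = await embeddingContext.getEmbeddingFor("Goodbye world");

console.log("Similarity:", cosineSimilarity(helloEmbedding.vector, byeEmbedding.vector));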

Pull-Request Checklist

github-actions[bot] commented 4 months ago

:tada: This PR is included in version 3.0.0-beta.14 :tada:

The release is available on:

Your semantic-release bot :package::rocket: