withcatai / node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level
https://withcatai.github.io/node-llama-cpp/
MIT License
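The tagline above mentions enforcing a JSON schema on the model output at the generation level. A minimal sketch of what that can look like with the 3.0 beta API, reusing the model path from the examples below; the `createGrammarForJsonSchema` call and the schema shape here are illustrative assumptions, not part of this PR:

import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "dolphin-2.1-mistral-7b.Q4_K_M.gguf")
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

// Build a grammar from a JSON schema; generation is then constrained to match it
const grammar = await llama.createGrammarForJsonSchema({
    type: "object",
    properties: {
        positiveWords: {type: "array", items: {type: "string"}},
        positivityScoreFromOneToTen: {type: "number"}
    }
} as const);

const res = await session.prompt("It's great!", {grammar});
const parsed = grammar.parse(res); // parsed according to the schema

console.log(parsed);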

feat: async operations #178

Closed · giladgd closed this 4 months ago

giladgd commented 4 months ago

Description of change

How to use node-llama-cpp after this change

Regular context
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();

// Load the model; onLoadProgress reports progress as a fraction between 0 and 1
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "dolphin-2.1-mistral-7b.Q4_K_M.gguf"),
    onLoadProgress(loadProgress: number) {
        console.log(`Load progress: ${loadProgress * 100}%`);
    }
});

// Use up to 4096 tokens of context, but no more than the model was trained with
const context = await model.createContext({
    contextSize: Math.min(4096, model.trainContextSize)
});
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const q1 = "Hi there, how are you?";
console.log("User: " + q1);

const a1 = await session.prompt(q1);
console.log("AI: " + a1);

const q2 = "Summerize what you said";
console.log("User: " + q2);

const a2 = await session.prompt(q2);
console.log("AI: " + a2);

Embedding

import {fileURLToPath} from "url";
import path from "path";
import {getLlama} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "functionary-small-v2.2.q4_0.gguf")
});

// A dedicated context type for generating embeddings
const embeddingContext = await model.createEmbeddingContext({
    contextSize: Math.min(4096, model.trainContextSize)
});

const text = "Hello world";
const embedding = await embeddingContext.getEmbeddingFor(text);

console.log(text, embedding.vector);
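Embeddings are usually compared rather than printed, typically by cosine similarity. A short sketch building on the example above; the similarity helper is not part of the library, and `embedding.vector` is assumed to be a plain array of numbers:

// Cosine similarity between two embedding vectors:
// dot(a, b) / (|a| * |b|), ranging from -1 (opposite) to 1 (identical direction)
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const helloEmbedding = await embeddingContext.getEmbeddingFor("Hello world");
const byeEmbedding = await embeddingContext.getEmbeddingFor("Goodbye world");

console.log("Similarity:", cosineSimilarity(helloEmbedding.vector, byeEmbedding.vector));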

Pull-Request Checklist

github-actions[bot] commented 4 months ago

:tada: This PR is included in version 3.0.0-beta.14 :tada:

The release is available on:

Your semantic-release bot :package::rocket: