Atome-FE / llama-node

Believe in AI democratization. llama for Node.js, backed by llama-rs, llama.cpp and rwkv.cpp; works locally on your laptop CPU. Supports llama/alpaca/gpt4all/vicuna/rwkv models.
https://llama-node.vercel.app/
Apache License 2.0

Error: Missing field `nGpuLayers` #80

Open bakiwebdev opened 1 year ago

bakiwebdev commented 1 year ago

Hello guys, I'm trying to run the mpt-7b model and I'm getting this error. I'd appreciate any help. Here are the details:

Node.js v19.5.0

node_modules\llama-node\dist\llm\llama-cpp.cjs:82
    this.instance = yield import_llama_cpp.LLama.load(path, rest, enableLogging);
                                                      ^

Error: Missing field `nGpuLayers`
    at LLamaCpp.<anonymous> (<path>\node_modules\llama-node\dist\llm\llama-cpp.cjs:82:52)
    at Generator.next (<anonymous>)
    at <path>\node_modules\llama-node\dist\llm\llama-cpp.cjs:50:61
    at new Promise (<anonymous>)
    at __async (<path>\node_modules\llama-node\dist\llm\llama-cpp.cjs:34:10)
    at LLamaCpp.load (<path>\node_modules\llama-node\dist\llm\llama-cpp.cjs:80:12)
    at LLM.load (<path>\node_modules\llama-node\dist\index.cjs:52:21)
    at run (file:///<path>/index.mjs:27:17)
    at file:///<path>/index.mjs:42:1
    at ModuleJob.run (node:internal/modules/esm/module_job:193:25) {
  code: 'InvalidArg'
}

[Screenshot: folder structure]

index.mjs

import { LLM } from "llama-node";
import { LLamaCpp } from "llama-node/dist/llm/llama-cpp.cjs"
import path from "path"

const model = path.resolve(process.cwd(), "./model/ggml-mpt-7b-base.bin");
const llama = new LLM(LLamaCpp);
const config = {
    path: model,
    enableLogging: true,
    nCtx: 1024,
    nParts: -1,
    seed: 0,
    f16Kv: false,
    logitsAll: false,
    vocabOnly: false,
    useMlock: false,
    embedding: false,
    useMmap: true,
};

const template = `How are you?`;
const prompt = `A chat between a user and an assistant.
USER: ${template}
ASSISTANT:`;

const run = async () => {
    await llama.load(config);

    await llama.createCompletion({
        nThreads: 4,
        nTokPredict: 2048,
        topK: 40,
        topP: 0.1,
        temp: 0.2,
        repeatPenalty: 1,
        prompt,
    }, (response) => {
        process.stdout.write(response.token);
    });
}

run();

Thank you for your time.

matthoffner commented 1 year ago

I don't think the MPT models work with Llama.cpp at this point? https://github.com/ggerganov/llama.cpp/issues/1333

I know there is ggml-js. Maybe there are others.

I'm using Python and ctransformers to try out new ggml models. I have a boilerplate for it here: https://huggingface.co/spaces/matthoffner/ggml-ctransformers-fastapi

dhd5076 commented 1 year ago

Can't speak to whether MPT models work, but to address that error message directly, this is the config I am using:


const modelFileLocation = path.resolve(process.cwd(), 'ggml-vic7b-q5_1.bin');
const config = {
    modelPath: modelFileLocation,
    enableLogging: true,
    nCtx: 1024,
    seed: 0,
    f16Kv: false,
    logitsAll: false,
    vocabOnly: false,
    useMlock: false,
    embedding: false,
    useMmap: true,
    nGpuLayers: 32
};
this.config = config;

Note the `nGpuLayers` field, which is missing from your config.

`nGpuLayers` can be set to 0 if you don't want to use cuBLAS or if you have not compiled with BLAS; 32 offloads the maximum number of layers to the GPU.
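
To make the choice concrete, here is a minimal sketch (the 0 and 32 values come from the note above; the USE_CUDA environment flag is just an assumption for illustration, not something llama-node reads):

// Hypothetical helper: pick nGpuLayers depending on whether llama.cpp
// was compiled with cuBLAS. USE_CUDA is an assumed flag for this sketch.
const useCuda = process.env.USE_CUDA === "1";
const nGpuLayers = useCuda
    ? 32 // offload up to 32 layers to the GPU (cuBLAS build)
    : 0; // CPU only: use this if llama.cpp was not compiled with BLAS

The resulting value then goes into the config object shown above.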

DavidBDiligence commented 1 year ago

This is the first example in the documentation.

hlhr202 commented 1 year ago

Sorry guys, I'm on vacation this week and I haven't updated the example yet. `nGpuLayers` is for the CUDA build only; you just need to pass 0 if you are not using CUDA.
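
Applied to the config from the original post, the fix would presumably look like this (a sketch; only the `nGpuLayers` line is new):

const config = {
    path: model,
    enableLogging: true,
    nCtx: 1024,
    nParts: -1,
    seed: 0,
    f16Kv: false,
    logitsAll: false,
    vocabOnly: false,
    useMlock: false,
    embedding: false,
    useMmap: true,
    nGpuLayers: 0, // not using CUDA, so no layers are offloaded to the GPU
};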

hlhr202 commented 1 year ago

Added documentation for the first example.

bakiwebdev commented 1 year ago

Thank you for your support, guys. It's working now.