Atome-FE / llama-node

Believe in AI democratization. llama for Node.js, backed by llama-rs, llama.cpp, and rwkv.cpp; works locally on your laptop CPU. Supports llama/alpaca/gpt4all/vicuna/rwkv models.
https://llama-node.vercel.app/
Apache License 2.0

Bumping from 0.1.5 to 0.1.6 results in `Error: invariant broken` #90

Open NitayRabi opened 1 year ago

NitayRabi commented 1 year ago

Got an LLM running with GPT4All models (tried with ggml-gpt4all-j-v1.3-groovy.bin and ggml-gpt4all-l13b-snoozy.bin).

Version 0.1.5: works.
Version 0.1.6: fails with `Error: invariant broken: 999255479 <= 2 in Some("{PATH_TO}/ggml-gpt4all-j-v1.3-groovy.bin")`

Package versions:

"@llama-node/core": "0.1.6",
"@llama-node/llama-cpp": "0.1.6",
"llama-node": "0.1.6",
/* eslint-disable @typescript-eslint/no-unused-vars */
/* eslint-disable @typescript-eslint/no-var-requires */
import { ModelType } from '@llama-node/core';
import { LLM } from 'llama-node';
// @ts-expect-error
import { LLMRS } from 'llama-node/dist/llm/llm-rs.cjs';
import path from 'path';

const modelPath = path.join(
  __dirname,
  '..',
  'models',
  'ggml-gpt4all-j-v1.3-groovy.bin',
);
// Use the llm-rs (llama-rs) backend
const llama = new LLM(LLMRS);

// Wrap the prompt in an Alpaca-style Instruction/Response template
const toChatTemplate = (prompt: string) => `### Instruction:
${prompt}

### Response:`;

// Stream a completion for `prompt`: onData receives each token, onDone fires when generation finishes
export const createCompletion = async (
  prompt: string,
  onData: (data: string) => void,
  onDone: () => void,
) => {
  const params = {
    prompt: toChatTemplate(prompt),
    numPredict: 128,
    temperature: 0.8,
    topP: 1,
    topK: 40,
    repeatPenalty: 1,
    repeatLastN: 64,
    seed: 0,
    feedPrompt: true,
  };
  // GPT4All-J is a GPT-J architecture model, hence ModelType.GptJ
  await llama.load({ modelPath, modelType: ModelType.GptJ });
  // The callback fires once per streamed token and a final time with completed=true
  await llama.createCompletion(params, (response) => {
    if (response.completed) {
      return onDone();
    } else {
      onData(response.token);
    }
  });
};
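
For reference, this is roughly how the exported helper is driven; the './completion' module path and the stdout logging here are illustrative assumptions, not part of the failing setup:

import { createCompletion } from './completion'; // hypothetical path to the module above

const main = async () => {
  await createCompletion(
    'What is the capital of France?',
    (token) => process.stdout.write(token),   // print each streamed token as it arrives
    () => process.stdout.write('\n[done]\n'), // called once generation completes
  );
};

main().catch(console.error);

With both packages on 0.1.5 this streams tokens as expected; after bumping to 0.1.6 the load step throws the invariant error above before any tokens are produced.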