Atome-FE / llama-node

Believe in AI democratization. llama for Node.js, backed by llama-rs, llama.cpp, and rwkv.cpp; works locally on your laptop CPU. Supports llama/alpaca/gpt4all/vicuna/rwkv models.
https://llama-node.vercel.app/
Apache License 2.0

Error: Too many tokens predicted #81

Closed: dhd5076 closed this issue 1 year ago

dhd5076 commented 1 year ago

Why is this being thrown as an error?

https://github.com/Atome-FE/llama-node/blob/649457a2528b6d3b2ef5d35e0507c4022419973a/packages/llama-cpp/src/llama.rs#L120

Forgive my ignorance if I'm missing something, but it seems like this should just set response.completed = true;

Context for the error: prompt is a string, e.g. "Hello World, ", and tokens is an integer, e.g. 128.

// Collects the streamed tokens; this snippet runs inside
// new Promise((resolve, reject) => { ... }).
const completion = [];

this.model.createCompletion({
    nThreads: 4,
    nTokPredict: tokens,
    topK: 40,
    topP: 0.1,
    temp: 0.2,
    repeatPenalty: 1,
    prompt,
}, (response) => {
    completion.push(response.token);
    console.log(response.token);
    if (response.completed || completion.length >= tokens) {
        resolve(completion);
        return;
    }
}).catch((error) => {
    reject(new Error("Failed to generate completion: " + error));
});

Again, if I'm not missing something, it seems clunky to have to check whether the error was actually thrown just because we hit the end token.
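For reference, the workaround I mean looks roughly like the sketch below: wrap createCompletion in a Promise and treat the "Too many tokens predicted" rejection as a normal end of generation rather than a failure. This is only a sketch built on the options shown above; the generate helper is hypothetical, and matching on the error message string is an assumption that may break if the error text changes between versions.

// Hypothetical helper: wraps createCompletion and treats the
// "Too many tokens predicted" rejection as a normal stop condition.
// Assumes `model` is an already-loaded llama-node model instance.
function generate(model, prompt, tokens) {
    return new Promise((resolve, reject) => {
        const completion = [];
        model.createCompletion({
            nThreads: 4,
            nTokPredict: tokens,
            topK: 40,
            topP: 0.1,
            temp: 0.2,
            repeatPenalty: 1,
            prompt,
        }, (response) => {
            completion.push(response.token);
            if (response.completed) {
                resolve(completion.join(""));
            }
        }).catch((error) => {
            // Hitting the token limit is not really a failure for our use case,
            // so translate that specific error into a normal result.
            // Assumption: the error stringifies to something containing this text.
            if (String(error).includes("Too many tokens predicted")) {
                resolve(completion.join(""));
            } else {
                reject(new Error("Failed to generate completion: " + error));
            }
        });
    });
}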

dhd5076 commented 1 year ago

I installed an updated version compiled directly from the repo instead of from the npm registry, and it seems to have resolved the issue.