Closed: sabagithub closed this issue 1 year ago
This is related to the context limit; you can try extending it. Run

catai config --edit nano

and change nCtx to something bigger, for example 4096:
export const SETTINGS_NODE_LLAMA = {
    enableLogging: false,
    nCtx: 4096,
    nParts: -1,
    seed: 0,
    f16Kv: false,
    logitsAll: false,
    vocabOnly: false,
    useMlock: false,
    embedding: false,
    useMmap: false,
    nGpuLayers: 3,
};
@ido-pluto How do you change the config now? I have the same issue, but it looks like the config command was removed. How can I achieve the same thing?
You can change the config via the settings button in the web UI.
@ido-pluto I just edited it to this, but it ignores my changes:
{
    "bind": "...",
    "nCtx": 10000,
    "n_ctx": 10000
}
There is a configuration guide in the readme; check it out.
The option you are looking for is here: https://withcatai.github.io/node-llama-cpp/api/type-aliases/LlamaContextOptions
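Going by the LlamaContextOptions page linked above, a conservative edit to the config file might look like this (a sketch: contextSize and batchSize are the option names from that page, while 4096 and 512 are illustrative values, 512 being llama.cpp's usual batch-size default):

```json
{
    "bind": "...",
    "contextSize": 4096,
    "batchSize": 512
}
```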
@ido-pluto I just added that, but now I get this error:
{
    "bind": "...",
    "contextSize": 10000,
    "batchSize": 10000
}
GGML_ASSERT: /home/runner/work/node-llama-cpp/node-llama-cpp/llama/llama.cpp/ggml-backend.c:519: data != NULL && "failed to allocate buffer"
That context size is too large for your machine. To reset the model, you can simply delete and reinstall it.
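The failed allocation is plausible once you estimate how the context's KV cache grows with contextSize. A rough back-of-the-envelope sketch (not node-llama-cpp's exact allocator; the layer count of 40 and embedding size of 5120 are assumed dimensions for a 13B LLaMA-family model, and f16 KV entries at 2 bytes each are assumed):

```typescript
// Rough KV-cache size estimate for a llama.cpp-style context:
// 2 tensors per layer (K and V), each holding nCtx * nEmbd elements.
function kvCacheBytes(
    nCtx: number,
    nLayers: number,
    nEmbd: number,
    bytesPerElem = 2 // f16 entries assumed
): number {
    return 2 * nLayers * nCtx * nEmbd * bytesPerElem;
}

// Assumed dimensions for a 13B LLaMA-family model: 40 layers, 5120 embedding size.
const bytes = kvCacheBytes(10000, 40, 5120);
console.log((bytes / 1024 ** 3).toFixed(1) + " GiB"); // ≈ 7.6 GiB for the KV cache alone
```

Nearly 8 GiB for the KV cache alone, on top of the model weights and compute buffers, would easily exhaust 16 GB of RAM; a smaller contextSize (and a modest batchSize) leaves far more headroom.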
If I ask "Please write a summary of all the countries in the world in alphabetical order. Include in each summary the country's population and population density.", it will write about 1000 tokens, then it'll just shut down, and the UI will lose the connection.
I was using the Stable Vicuna 13B model on 16 GB of RAM.
If you don't experience this issue, then I think this can be closed, as it's probably just my system's limitation.