withcatai / node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Force a JSON schema on the model output at the generation level.
https://withcatai.github.io/node-llama-cpp/
MIT License

Could not find a KV slot #136

Closed: Zambonilli closed this issue 6 months ago

Zambonilli commented 6 months ago

Issue description

In version 2.8.3 I am receiving a "could not find a KV slot for the batch (try reducing the size of the batch or increase the context)" error when running many zero-shot prompts in a loop with the same context.

Expected Behavior

To be able to perform zero-shot prompts in a loop without failure.

Actual Behavior

I receive the error above after a few iterations, regardless of the number of layers or whether I create multiple contexts.
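For illustration, here is a minimal plain-TypeScript sketch of the failure mode being described (this is not node-llama-cpp's internal implementation; the `KvCache` class and its numbers are hypothetical): a fixed-capacity KV cache runs out of slots when each prompt in a loop keeps consuming cells that are never freed.

```typescript
// Hypothetical simulation of a fixed-capacity KV cache.
class KvCache {
    private used = 0;
    constructor(private readonly contextSize: number) {}

    // Try to reserve `batchTokens` cells; returns false when they don't fit,
    // mirroring the "could not find a KV slot for the batch" failure mode.
    allocate(batchTokens: number): boolean {
        if (this.used + batchTokens > this.contextSize) return false;
        this.used += batchTokens;
        return true;
    }

    // Evicting a finished sequence frees its cells again.
    free(batchTokens: number): void {
        this.used = Math.max(0, this.used - batchTokens);
    }
}

const cache = new KvCache(1024);
let failures = 0;
for (let i = 0; i < 10; i++) {
    // Each zero-shot prompt consumes ~200 tokens and is never freed,
    // as if the context kept the state of every previous evaluation.
    if (!cache.allocate(200)) failures++;
}
console.log(failures); // 5 — allocations start failing once the 1024 cells fill up
```

If the library (or user code) does not release slots between independent prompts, the loop fails after a few iterations even though each individual prompt fits comfortably in the context, which matches the symptom reported here.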

Steps to reproduce

  1. Follow the setup instructions for my OSS project, genai-gamelist
  2. Update package.json to change node-llama-cpp from the 3.x beta to 2.8.3
  3. Run the script

My Environment

| Dependency | Version |
| --- | --- |
| Operating System | Ubuntu 22.04 LTS |
| CPU | AMD Ryzen 9 5900X |
| Node.js | 18.x |
| TypeScript | 5.3.3 |
| node-llama-cpp | 2.8.3 |

Additional Context

After only upgrading to the 3.x beta, I am no longer able to reproduce the issue with the same script.

Relevant Features Used

Are you willing to resolve this issue by submitting a Pull Request?

Yes, I have the time, but I don't know how to start. I would need guidance.

giladgd commented 6 months ago

@Zambonilli I originally thought you encountered this issue in the version 3 beta. I'm aware of this issue in version 2.x, and I can confirm that version 3 is supposed to fix it. It's a good sign that you don't encounter it anymore when using version 3.

If you encounter it on the version 3 beta, let me know so I can investigate it.

sagarjgborg commented 6 months ago

The error still persists in node-llama-cpp@3.0.0-beta.1:

[Error: could not find a KV slot for the batch (try reducing the size of the batch or increase the context)]

giladgd commented 6 months ago

@sagarjgborg Please provide code to reproduce this issue on the version 3 beta, along with a link to the model file to use with the code.

sagarjgborg commented 6 months ago

Here is my repo: https://github.com/sagarjgborg/kv_slot_issue.git. Just make a few API calls in a row and you will see the error.

Zambonilli commented 6 months ago

I think you're getting this error because you're explicitly setting the context size to 1024.

sagarjgborg commented 6 months ago

@Zambonilli I noticed that too, but why is that? Can I set a larger context size for a model trained on 4096? For example, a context size of 8192?
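As background on this last question: running a context larger than the model's trained window typically requires RoPE scaling rather than just raising the context size. A minimal sketch of the linear-scaling arithmetic follows (the helper name `ropeFreqScale` is hypothetical, but the trained/requested ratio matches how llama.cpp's linear `rope_freq_scale` is commonly derived, e.g. 0.5 for doubling a 4096 window):

```typescript
// Hypothetical helper: the linear RoPE frequency scale needed to stretch a
// model's trained context window to a larger requested one.
// Values below 1 compress token positions into the trained range.
function ropeFreqScale(trainedContext: number, requestedContext: number): number {
    if (requestedContext <= trainedContext) {
        return 1; // within the trained window, no scaling needed
    }
    return trainedContext / requestedContext;
}

console.log(ropeFreqScale(4096, 8192)); // 0.5 — positions squeezed by half
console.log(ropeFreqScale(4096, 2048)); // 1 — fits the trained window as-is
```

Even with scaling, quality beyond the trained context is not guaranteed; simply requesting 8192 on a 4096-trained model without any scaling usually degrades output rather than fixing KV slot errors.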