withcatai / node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Force a JSON schema on the model output at the generation level
https://node-llama-cpp.withcat.ai
MIT License

bug: model parameter `threads` doesn't work #114

Closed: pafik13 closed this issue 7 months ago

pafik13 commented 9 months ago

Issue description

It seems to me that the `threads` parameter doesn't work as expected.

Expected Behavior

If I have 24 CPUs and pass `threads: 24`, then all CPUs should be utilized. I tried calling the original llama.cpp with the argument `-t 24` and it works as expected.

Actual Behavior

Whether I pass `threads: 24` or `threads: 1` to the constructor, nothing changes: it always drives 4 CPUs above 80% utilization and sometimes uses 1-2 additional CPUs at 25-50% utilization.

Steps to reproduce

Pass different `threads` values to the model constructor and observe CPU utilization (for example, with htop).
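For example, a minimal repro sketch (assuming the v2.x `LlamaModel`/`LlamaContext`/`LlamaChatSession` API; the model path is the one used in the snippets below):

import {LlamaModel, LlamaContext, LlamaChatSession} from "node-llama-cpp";

// vary `threads` between runs and watch CPU utilization in htop
const model = new LlamaModel({
    modelPath: "/root/catai/models/phind-codellama-34b-q3_k_s",
    threads: 24
});
const context = new LlamaContext({model});
const session = new LlamaChatSession({context});

// a long generation keeps the CPUs busy long enough to observe
await session.prompt("Please, write JavaScript function to sort array");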

My Environment

Dependency               Version
Operating System         Ubuntu 22.04.3 LTS
CPU                      AMD EPYC 7742 64-Core Processor
Node.js version          v20.10.0
TypeScript version       not used
node-llama-cpp version   3.0.0-beta.1 & v2.8.1

Additional Context

./llama.cpp/main -m ./catai/models/phind-codellama-34b-q3_k_s -p "Please, write JavaScript function to sort array" -ins -t 24

[Screenshot from 2023-12-07 23-03-02]

const model = new LlamaModel({
    modelPath: "/root/catai/models/phind-codellama-34b-q3_k_s",
    threads: 1,
});

[Screenshot from 2023-12-07 23-04-41]

const model = new LlamaModel({
    modelPath: "/root/catai/models/phind-codellama-34b-q3_k_s",
    threads: 24,
});

[Screenshot from 2023-12-07 23-05-59]

Relevant Features Used

Are you willing to resolve this issue by submitting a Pull Request?

Yes, I have the time, but I don't know how to start. I would need guidance.

giladgd commented 9 months ago

@pafik13 Thanks for the detailed issue, it helped me a lot in investigating this problem :) I found the bug; I'll include the fix for it in the next beta.

stewartoallen commented 8 months ago

@giladgd Can you point me to where you found the problem? If I can fix it locally, I can submit a PR.
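(Not the actual node-llama-cpp internals, but the usual failure mode in bindings like this is that the option never reaches the native side, so the default thread count silently wins. A hypothetical sketch of that bug class, with all names made up for illustration:)

// hypothetical options-forwarding layer, NOT real node-llama-cpp code
const DEFAULT_THREADS = 4; // assumed default; would match the constant 4 busy CPUs observed above

function buildNativeContextOptions(userOptions) {
    return {
        // the bug class: reading the wrong key (e.g. `userOptions.thread`)
        // or dropping the key entirely means the default is always used,
        // no matter what the caller passed
        threads: userOptions.threads ?? DEFAULT_THREADS
    };
}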

github-actions[bot] commented 7 months ago

🎉 This issue has been resolved in version 3.0.0-beta.2 🎉

The release is available on:

Your semantic-release bot 📦🚀

github-actions[bot] commented 7 months ago

🎉 This issue has been resolved in version 3.0.0-beta.4 🎉

The release is available on:

Your semantic-release bot 📦🚀