withcatai / node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.
https://node-llama-cpp.withcat.ai
MIT License

Electron sample app crash on Mac with specific model Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf #361

Closed: sytolk closed this issue 3 weeks ago

sytolk commented 1 month ago

Issue description

The Electron sample app crashes on macOS.

Expected Behavior

no crash

Actual Behavior

crash

Steps to reproduce

I'm testing with the template https://github.com/withcatai/node-llama-cpp/tree/master/templates/electron-typescript-react as well as with the installed .dmg app (the result is the same). After browsing for and choosing the model Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf, the Electron app crashes (crash log attached).
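
For context, a minimal sketch of the node-llama-cpp v3 loading flow that the template wraps; the crash reported here happens around this load step. This is not the template's exact code: the Electron IPC wiring is omitted and the model path is a placeholder.

import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

// getLlama() auto-detects the backend; on an Intel Mac it falls back to CPU since Metal is unavailable.
const llama = await getLlama();
// Loading the chosen .gguf file is roughly where the reported crash occurs.
const model = await llama.loadModel({
    modelPath: path.join(process.cwd(), "models", "Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf") // placeholder path
});
const context = await model.createContext();
const session = new LlamaChatSession({contextSequence: context.getSequence()});
console.log(await session.prompt("Hi there!"));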

My Environment

OS: macOS 23.4.0 (x64)
Node: 18.18.2 (x64)
node-llama-cpp: 3.1.1

Metal: not supported by llama.cpp on Intel Macs

CPU model: Intel(R) Core(TM) i5-8500B CPU @ 3.00GHz
Math cores: 6
Used RAM: 99.24% (31.76GB/32GB)
Free RAM: 0.75% (248.02MB/32GB)

Additional Context

No response

Relevant Features Used

Are you willing to resolve this issue by submitting a Pull Request?

Yes, I have the time, but I don't know how to start.

sytolk commented 1 month ago

It appears that the crash depends on the model; I don't have the problem with the gemma-2-2b-it-Q4_K_M.gguf file.

giladgd commented 1 month ago

There seems to be an issue with one of the latest llama.cpp releases that introduced this crash, but only in Electron on macOS when Metal is not used (I managed to reproduce it on an M1 machine by building with Metal support disabled). It probably has something to do with Electron's strict guards on memory allocation.

~From my tests, it seems that the llama.cpp release that's bundled with node-llama-cpp version 3.1.0 is not affected, so I recommend locking your package.json to that version until this issue is fixed.~ Update: It seems that version 3.1.0 is also affected.
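
(Pinning here means an exact dependency entry in package.json, e.g. "node-llama-cpp": "3.1.0" with no semver range, or running npm install node-llama-cpp@3.1.0 --save-exact; note that, per the update above, 3.1.0 turned out to be affected as well.)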

I'll continue investigating the root cause of this.


To make it easier to reproduce this issue on Apple Silicon Macs, these commands can be used:

mkdir repro
cd repro
npm init -y
npm install node-llama-cpp@3.1.1 electron
npx --no node-llama-cpp source build --gpu false
npx electron ./node_modules/node-llama-cpp/dist/cli/cli.js chat --prompt 'Hi there!' --gpu false
# and then select Llama 3.1 8B from the list of models
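
For clarity, the --gpu false flags force a CPU-only build and run (mirroring the Intel Mac case where Metal is not available), and launching the node-llama-cpp CLI through npx electron rather than plain Node is what exposes the Electron-specific crash.
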
github-actions[bot] commented 3 weeks ago

:tada: This issue has been resolved in version 3.2.0 :tada:

The release is available on:

Your semantic-release bot :package::rocket: