withcatai / node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Force a JSON schema on the model output at the generation level
https://withcatai.github.io/node-llama-cpp/
MIT License

Loading Llama3 in Electron #212

Closed · bitterspeed closed this issue 2 months ago

bitterspeed commented 2 months ago

Issue description

Electron crashes when loadModel finishes loading (beta)

Expected Behavior

After loading a model using this code and trying to create a context, I'd expect there not to be a crash in llama-addon.node. I've tried with and without Metal enabled.

If a crash does happen, I'd expect an error log from node-llama-cpp, but no error logs show up either.
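
For reference, this is roughly how I toggled Metal off for the second attempt (a sketch; I'm assuming the beta's gpu option on getLlama here, and the exact option name may differ between beta versions):

// Sketch: initialize the bindings with the GPU (Metal) disabled.
// Assumes getLlama() accepts a `gpu` option in this beta; the option name may differ.
const llamaCpuOnly = await getLlama({ gpu: false });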

Actual Behavior

When creating a context with Llama 3 in Electron v28, Electron crashes at runtime with an EXC_CRASH error (and no console logs):

-------------------------------------
Translated Report (Full Report Below)
-------------------------------------
Process:               Electron [76706]
Path:                  /Users/USER/Documents/*/Electron.app/Contents/MacOS/Electron
Identifier:            com.github.Electron
Version:               28.2.9 (28.2.9)
Code Type:             ARM-64 (Native)
Parent Process:        Exited process [76691]
Responsible:           Terminal [710]
User ID:               501

Date/Time:             2024-05-03 13:33:12.9471 -0700
OS Version:            macOS 14.2.1 (23C71)
Report Version:        12
Anonymous UUID:        9118B97C-7A81-646B-40E0-2AE653F176A4

Sleep/Wake UUID:       FE4FA3B9-0313-4913-85C1-72873A23C9D1

Time Awake Since Boot: 630000 seconds
Time Since Wake:       353677 seconds

System Integrity Protection: enabled

Crashed Thread:        0  CrBrowserMain  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_CRASH (SIGTRAP)
Exception Codes:       0x0000000000000000, 0x0000000000000000

Termination Reason:    Namespace SIGNAL, Code 5 Trace/BPT trap: 5
Terminating Process:   Electron [76706]

Thread 0 Crashed:: CrBrowserMain Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib                 0x185749af8 __kill + 8
1   Electron Framework                     0x10b600508 uv_kill + 12
2   Electron Framework                     0x11226441c node::OnFatalError(char const*, char const*) + 460292
3   ???                                    0x157e4fb74 ???
4   ???                                    0x157e4d728 ???
5   ???                                    0x157e4d728 ???
6   ???                                    0x150091aa8 ???
7   ???                                    0x157e4b108 ???
8   ???                                    0x157e4adf8 ???
9   Electron Framework                     0x10c950d80 v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 488

Steps to reproduce

  1. Start a default Electron project.
  2. Download the model.
  3. Use the following code in src/main.ts to load a model.

(The code below works fine in plain terminal Node.js; see the standalone sketch after the steps. That may suggest this is an issue on Electron's side, but I don't know enough to be sure. Please let me know if you have ideas on where to start debugging.)

// In this file you can include the rest of your app's specific main process
// code. You can also put them in separate files and import them here.
import path from 'path';
import { app } from 'electron';

async function main() {
  // node-llama-cpp is an ES module, so it is loaded via a dynamic import
  const { getLlama, LlamaChatSession, Llama3ChatWrapper } = await import(
    'node-llama-cpp'
  );

  const llama = await getLlama();
  try {
    console.log('Loading model');
    const model = await llama.loadModel({
      modelPath: path.join(
        app.getPath('sessionData'),
        'models',
        'Meta-Llama-3-8B-Instruct-Q4_K_M.gguf'
      ),
      onLoadProgress(loadProgress: number) {
        console.log(`Load progress: ${loadProgress * 100}%`);
      },
    });
    const context = await model.createContext(); // crash happens here
    const session = new LlamaChatSession({
      contextSequence: context.getSequence(),
      chatWrapper: new Llama3ChatWrapper(),
    });
    const a1 = await session.prompt('test ');
    console.log(a1);
  } catch (e) {
    console.log(e);
  }
}

main();
  4. Run the application.
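
For comparison, this is roughly the standalone script I ran from the terminal to confirm the same flow works outside Electron (a sketch; the hard-coded model path stands in for app.getPath('sessionData') and is just an example):

// standalone-test.mjs - plain Node.js, no Electron (hypothetical file name)
import path from 'path';
import { getLlama, LlamaChatSession, Llama3ChatWrapper } from 'node-llama-cpp';

const llama = await getLlama();
const model = await llama.loadModel({
  // Hard-coded path instead of Electron's app.getPath('sessionData'); adjust as needed.
  modelPath: path.join(process.cwd(), 'models', 'Meta-Llama-3-8B-Instruct-Q4_K_M.gguf'),
});
const context = await model.createContext(); // no crash here when run outside Electron
const session = new LlamaChatSession({
  contextSequence: context.getSequence(),
  chatWrapper: new Llama3ChatWrapper(),
});
console.log(await session.prompt('test'));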

My Environment

Dependency              Version
Operating System        Sonoma 14.2.1
CPU                     Apple M1 Max
Node.js version         v18.19
TypeScript version      5.4.2
node-llama-cpp version  3.0.0-beta.17

Additional Context

No response

Relevant Features Used

Are you willing to resolve this issue by submitting a Pull Request?

Yes, I have the time, but I don't know how to start. I would need guidance.

giladgd commented 2 months ago

I have shared my investigation of this problem on the Electron issue about this matter: https://github.com/electron/electron/issues/41513#issuecomment-2094327194

github-actions[bot] commented 2 months ago

:tada: This issue has been resolved in version 3.0.0-beta.18 :tada:

The release is available on:

Your semantic-release bot :package::rocket: