microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Unpredictable onnxruntime-node crash when using Electron #20084

Open NexelOfficial opened 6 months ago

NexelOfficial commented 6 months ago

Describe the issue

I'm using onnxruntime-node in an Electron project. I'm trying to implement a multithreading solution with Node Worker Threads, but when creating a number of worker threads in Electron, they randomly crash when importing onnxruntime-node. The issue is unpredictable: sometimes it happens, sometimes it doesn't. The more Workers created, the more frequently the crash occurs. Here is the full crash output:

[7848:0326/114947.582:ERROR:crashpad_client_win.cc(868)] not connected

 ELIFECYCLE  Command failed with exit code 4294930435.

After this, the app exits. Again, this is all confusing to me because it's random whether the crash will occur.

To reproduce

  1. Clone my example repo that demonstrates this issue: https://github.com/NexelOfficial/electron-onnx-workers.
  2. Install dependencies using pnpm or npm.
  3. Run the app using electron . (or pnpm start / npm run start)
  4. If the error doesn't occur, please increase the number of Workers in index.js as in the example below. Again, this error is unpredictable, so it can take some tries before it occurs.

     ```js
     for (let i = 0; i < WORKER_AMOUNT_GOES_HERE; i++) {
       new Worker(path.join(__dirname, "onnxWorker.js"));
     }
     ```

Urgency

Project deadline in about 4 weeks

System information

Platform: Windows 10
OS Version: Pro Edition x64
ONNX Runtime Installation: onnxruntime-node
ONNX Runtime Version or Commit ID: 1.17.0
ONNX Runtime API: Not provided
Architecture: Not provided
Execution Provider: Not provided
Execution Provider Library Version: Not provided

NexelOfficial commented 6 months ago

Update: The error seems to be gone when spawning the threads one after another and waiting for each model to load before starting the next worker. Make the following changes to replicate the fix:

  1. In your worker, add the following code:

     ```js
     const onnxruntime = require("onnxruntime-node");
     const { parentPort } = require("worker_threads");

     // Load your model
     await onnxruntime.InferenceSession.create("model/yolov8n-pose.onnx", {
       enableMemPattern: false,
       intraOpNumThreads: 1,
     });

     // Add this line here. It sends a message to the main thread
     // that the next worker can be loaded.
     parentPort.postMessage({ message: "ONNX_READY" });
     ```

2. In your main thread, add the following code:
```js
// Keep a list of all workers
const workers = [];

const createWorker = () => {
  const worker = new Worker(path.join(__dirname, "onnxWorker.js"));
  workers.push(worker);

  // Create another worker when the previous one is loaded and more are needed
  worker.on("message", (data) => {
    // Continue if there are not 8 workers yet (change to however many you need)
    if (data.message === "ONNX_READY" && workers.length < 8) {
      return createWorker();
    }
  });
};

app.whenReady().then(() => {
  // ... other code

  // Start creating workers when your app is loaded
  createWorker();
});
```

You might need to add some extra code that waits for all workers to start, since creating them one by one takes some time. I have not seen the error since on any of my three machines, and I hope this helps others as well.

github-actions[bot] commented 5 months ago

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.