microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

[Web] Memory spike in ORT-web leading to app crash #15086

Open abgoswam opened 1 year ago

abgoswam commented 1 year ago

Describe the issue

We have an ONNX model converted from LightGBM.

We observe a memory spike when using the ONNX model:

(screenshot: memory usage spike)


To reproduce

Take the model and invoke it multiple times.
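A self-contained sketch of the repro loop, with the ORT inference call stubbed out so it runs without the package installed. In the actual repro, `runOnce` would call `session.run(feeds)` on an `ort.InferenceSession` created from the converted model (those names follow the ORT JS API); the model path and feed names here are illustrative assumptions.

```javascript
// Stub standing in for `session.run(feeds)` on an onnxruntime-web
// InferenceSession; it just allocates a short-lived buffer per call.
const runOnce = async () => {
  const scratch = new Float32Array(1024);
  return scratch.reduce((a, b) => a + b, 0);
};

// Invoke the model repeatedly and report heap growth, which is how the
// spike in this issue shows up.
async function measure(iterations) {
  const before = process.memoryUsage().heapUsed;
  for (let i = 0; i < iterations; i++) {
    await runOnce();
  }
  const after = process.memoryUsage().heapUsed;
  return { before, after, growthBytes: after - before };
}

measure(1000).then((stats) => {
  console.log(`heap growth over 1000 runs: ${stats.growthBytes} bytes`);
});
```

Watching `heapUsed` (or the process RSS in a system monitor) across iterations is enough to reproduce the reported spike once the stub is swapped for the real `session.run` call.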

Urgency

Yes. We plan to use ORT-web in two scenarios; hopefully we can find a fix/workaround for this soon.

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

"onnxruntime-web": "^1.15.0-dev.20230212-12d91173c4"

Execution Provider

WASM

abgoswam commented 1 year ago

cc @fs-eire @xalili @Yi-Mao @isidorn

abgoswam commented 1 year ago

We also tried a TinyBERT model and observed the same "memory spike" behaviour:

(screenshot: memory usage spike)

elephantpanda commented 1 year ago

Is this related? #15080

fs-eire commented 1 year ago

Just to clarify - this issue is about using onnxruntime-web on Node.js, not inside browsers.

The root cause has been identified: the environment is running Node.js v16.0.0 (WSL2), whose underlying V8 engine is version 9.0. This is an old version of V8; many improvements have been made to the JS engine since then (the latest Chromium-based browsers use V8 v11.1), and we see that memory usage is significantly reduced in the latest V8.
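To check whether a given environment is affected, the Node.js and embedded V8 versions can be read directly from `process.versions`; a minimal check (the 10.x cutoff below is an illustrative threshold, not an official one):

```javascript
// Print the running Node.js and embedded V8 versions to confirm whether
// this environment uses the old V8 (9.x for Node.js v16.0.0) described above.
const [v8Major] = process.versions.v8.split('.').map(Number);
console.log(`Node.js ${process.versions.node} / V8 ${process.versions.v8}`);
if (v8Major < 10) {
  console.log('Old V8 engine; expect higher memory usage with onnxruntime-web.');
}
```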

I took a memory heap snapshot from devtools using the --inspect command-line flag and validated that Node.js v16.0.0 consumes much more memory than expected when loading even a very simple ONNX model.

The conclusion is that we don't recommend using onnxruntime-web on an old version of Node.js. Using onnxruntime-web in browsers is usually fine, as users generally run relatively recent versions. Using onnxruntime-node on Node.js is the better option, and in this case I verified that there is no memory issue with onnxruntime-node.