microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

[Web] Memory spike in ORT-web leading to app crash #15086

Open abgoswam opened 1 year ago

abgoswam commented 1 year ago

Describe the issue

We have an ONNX model converted from LightGBM.

We observe a memory spike when using the ONNX model:

(screenshot: memory usage spike)


To reproduce

Take the model and invoke it multiple times.
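A self-contained sketch of the repro loop, with the ORT inference call stubbed out so it runs without the package installed. In the actual repro, `runOnce` would call `session.run(feeds)` on an `ort.InferenceSession` created from the converted model (those names follow the ORT JS API); the model path and feed names here are illustrative assumptions.

```javascript
// Stub standing in for `session.run(feeds)` on an onnxruntime-web
// InferenceSession; it just allocates a short-lived buffer per call.
const runOnce = async () => {
  const scratch = new Float32Array(1024);
  return scratch.reduce((a, b) => a + b, 0);
};

// Invoke the model repeatedly and report heap growth, which is how the
// spike in this issue shows up.
async function measure(iterations) {
  const before = process.memoryUsage().heapUsed;
  for (let i = 0; i < iterations; i++) {
    await runOnce();
  }
  const after = process.memoryUsage().heapUsed;
  return { before, after, growthBytes: after - before };
}

measure(1000).then((stats) => {
  console.log(`heap growth over 1000 runs: ${stats.growthBytes} bytes`);
});
```

Watching `heapUsed` (or the process RSS in a system monitor) across iterations is enough to reproduce the reported spike once the stub is swapped for the real `session.run` call.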

Urgency

Yes. We plan to use ORT-web in two scenarios; hopefully we can find a fix/workaround for this soon.

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

"onnxruntime-web": "^1.15.0-dev.20230212-12d91173c4"

Execution Provider

WASM

abgoswam commented 1 year ago

cc @fs-eire @xalili @Yi-Mao @isidorn

abgoswam commented 1 year ago

We also tried a TinyBERT model and observed the same "memory spike" behaviour:

(screenshot: memory usage spike)

elephantpanda commented 1 year ago

Is this related? #15080

fs-eire commented 1 year ago

Just to clarify - this issue is about using onnxruntime-web on Node.js, not inside browsers.

The root cause has been identified: the environment is running Node.js v16.0.0 (WSL2), whose underlying V8 engine is version 9.0. This is an old version of V8; many improvements have been made to the JS engine since then (the latest Chromium-based browsers use V8 v11.1), and we see that memory usage is significantly reduced in the latest V8.
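To check whether a given environment is affected, the Node.js and embedded V8 versions can be read directly from `process.versions`; a minimal check (the 10.x cutoff below is an illustrative threshold, not an official one):

```javascript
// Print the running Node.js and embedded V8 versions to confirm whether
// this environment uses the old V8 (9.x for Node.js v16.0.0) described above.
const [v8Major] = process.versions.v8.split('.').map(Number);
console.log(`Node.js ${process.versions.node} / V8 ${process.versions.v8}`);
if (v8Major < 10) {
  console.log('Old V8 engine; expect higher memory usage with onnxruntime-web.');
}
```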

I took a memory heap snapshot from devtools using the --inspect command-line flag and validated that Node.js v16.0.0 consumes much more memory than expected when loading even a very simple ONNX model.

The conclusion is that we don't recommend using onnxruntime-web on an old version of Node.js. Using onnxruntime-web in browsers is usually fine, as users generally run relatively recent versions. Using onnxruntime-node on Node.js is the better option, and in this case I verified that there is no memory issue with onnxruntime-node.