starsky opened 1 year ago
It's possible that the underlying code is slower on simple or small shapes. Does your targeted model have 1x1 convolutions, or was the 1x1 convolution just created for testing? If so, I'd suggest focusing on the convolution ops in the targeted model.
@wschin
I did the same experiment with 3x3 convs and got a similar speed difference (4x slower). I tested 1x1 and 3x3 convs because they are common building blocks in most architectures. Also, the size of the input tensor is 256x144, which is a realistic input size.
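For scale, here is a rough FLOP count for a single conv layer at that input size. The channel counts are not stated in this thread, so `C_IN = C_OUT = 16` below is a purely hypothetical choice, only to illustrate how small the per-layer workload is:

```python
# Rough FLOP count for one conv layer on a 256x144 input.
# Channel counts are NOT given in the issue; C_IN = C_OUT = 16 is a
# hypothetical assumption used only for illustration.

H, W = 256, 144
C_IN = C_OUT = 16

def conv_flops(kernel, c_in=C_IN, c_out=C_OUT, h=H, w=W):
    """Multiply-accumulates x2, assuming 'same' padding and stride 1."""
    return 2 * h * w * c_in * c_out * kernel * kernel

flops_1x1 = conv_flops(1)  # ~18.9 MFLOPs
flops_3x3 = conv_flops(3)  # ~170 MFLOPs

print(f"1x1 conv: {flops_1x1 / 1e6:.1f} MFLOPs")
print(f"3x3 conv: {flops_3x3 / 1e6:.1f} MFLOPs")
```

At this size, a single layer is small enough that fixed per-inference overhead can dominate the measurement, which is why the layer-count sweep below matters.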
WebAssembly is slower than native. It depends a lot on what you do, but 3-5x is common. For conv, the main reasons are that SIMD in wasm is 128-bit and that much more effort has gone into optimizing the conv kernels in native. We will be doing some work to address the latter soon, but I'm not sure how much we can get out of it - I'd expect a factor of 2-3x slower.
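Back-of-envelope arithmetic for the SIMD-width point above: wasm SIMD vectors are 128-bit, while the native width depends on the CPU. Assuming AVX2 (256-bit) on this x64 machine is a guess on my part, not something stated in the thread:

```python
# How much of the slowdown could come from SIMD width alone?
# WASM SIMD is fixed at 128-bit; 256-bit (AVX2) for native is a
# hypothetical assumption for this x64 machine.

WASM_SIMD_BITS = 128
NATIVE_SIMD_BITS = 256  # assumed: AVX2
FLOAT_BITS = 32

wasm_lanes = WASM_SIMD_BITS // FLOAT_BITS      # fp32 values per vector op
native_lanes = NATIVE_SIMD_BITS // FLOAT_BITS  # fp32 values per vector op

print(f"fp32 lanes: wasm={wasm_lanes}, native={native_lanes}")
print(f"SIMD width alone accounts for ~{native_lanes // wasm_lanes}x")
```

Under that assumption, vector width explains roughly 2x; the remaining gap would come from kernel tuning, which is consistent with the 2-3x estimate above.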
Describe the issue
I am trying to optimize my models for WebAssembly ONNX Runtime. I ran some tests regarding the Conv operation speed difference between Web and Native ONNX Runtime.
I created a model that does a 1x1 conv and progressively added more 1x1 conv layers, from 1 to 50, measuring inference time for native and WebAssembly. I estimated that on my machine some constant operations (e.g. data loading) take ~0.17 ms natively vs 0.3 ms on web.
But the time for a single 1x1 conv layer is 0.026 ms for native vs 0.1 ms for web, which is almost 4x slower. Is this expected? Or are there ways to improve the speed? The model is very simple, and I used ONNX Simplifier to optimize it. I struggle to find what kind of performance loss is expected in the documentation.
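The analysis above (separating fixed overhead from per-layer cost) can be sketched as a least-squares line fit over (layer count, latency) pairs. The latencies below are synthetic, generated from the estimates quoted in this issue, not real measurements:

```python
# Fit t = overhead + per_layer * n to (layer count, latency) pairs, so
# fixed costs (data loading, session overhead) are separated from the
# per-conv cost. Sample data is SYNTHETIC, built from the estimates in
# the issue (0.17 ms + 0.026 ms/layer native, 0.3 ms + 0.1 ms/layer web).

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

layers = [1, 10, 20, 30, 40, 50]

# Synthetic latencies (ms); in practice these come from timed runs.
native_ms = [0.17 + 0.026 * n for n in layers]
web_ms = [0.30 + 0.100 * n for n in layers]

native_overhead, native_per_layer = fit_line(layers, native_ms)
web_overhead, web_per_layer = fit_line(layers, web_ms)

print(f"native: {native_overhead:.2f} ms + {native_per_layer:.3f} ms/layer")
print(f"web:    {web_overhead:.2f} ms + {web_per_layer:.3f} ms/layer")
print(f"per-layer slowdown: {web_per_layer / native_per_layer:.1f}x")  # ~3.8x
```

With real measurements this separates the constant ~0.3 ms web overhead from the per-conv cost, so the 4x figure reflects the conv kernels themselves rather than session or data-loading overhead.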
To reproduce
example.onnx.gz
Here is a 50-layer 1x1 conv model. In my case it is 3.75 times slower on web than in the native ONNX Runtime.
I use this code to run the model on web: https://github.com/microsoft/onnxruntime-inference-examples/tree/main/js/quick-start_onnxruntime-web-script-tag
Urgency
No response
Platform
Web Browser
OS Version
Linux
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
ONNX Runtime Web v1.14.0
ONNX Runtime API
JavaScript
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
Model File
No response
Is this a quantized model?
No