As title, is there any sample code to show how to inference an ONNX model? I want to compare the performance between this proposal (webml) and onnx.js.
Thank you for reaching out. I'm sorry we don't have a simple example yet. You could follow the steps below to inference an ONNX model from scratch:

1. Run `npm i && npm start` under the top directory.
2. Create a `test_onnx.html` under `examples`.
3. Copy this gist into it: https://gist.github.com/pinzhenx/4a1aa06b7750bca5a8b4b5a49245ef01

Some caveats: the valid combinations of `backend` and `prefer` (line 15 of the gist) are listed below, with a usage sketch after the list.
- `backend: 'WASM'`
- `backend: 'WebGL'`
- `backend: 'WebML', prefer: 'fast'`
- `backend: 'WebML', prefer: 'sustained'`
- `backend: 'WebML', prefer: 'low'`
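As a minimal sketch of how these options are passed, assuming an importer in the style of the gist (the `OnnxModelImporter` name, the `rawModel` field, and the `createCompiledModel` step are assumptions; check your copy of `test_onnx.html` for the actual names):

```js
// Hypothetical wiring: pick one backend/prefer pair from the list above
// and hand it to the ONNX importer along with the loaded model.
const importer = new OnnxModelImporter({
  rawModel,            // the loaded ONNX model (assumed field name)
  backend: 'WebML',    // one of 'WASM' | 'WebGL' | 'WebML'
  prefer: 'sustained', // only meaningful with the 'WebML' backend
});
await importer.createCompiledModel(); // assumed compile step before compute()
```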
Also note that only the following ONNX ops are currently supported: `Conv`, `Relu`, `AveragePool`, `MaxPool`, `Concat`, `Dropout`, `GlobalAveragePool`, `Softmax`, `BatchNormalization`, `Add`, `Mul`, `Constant`, `Reshape`, `Flatten`, `Gemm`, `Sum`, `Unsqueeze`.
Hi @pinzhenx, thank you for your help. I have four questions.

1. How can I convert from NHWC to NCHW?
2. I got this error message when I used WebML as the backend: `Uncaught (in promise) Error: Fails to initialize neural network context`
3. My model contains `concat([x, -x])`, and `-x` seems to be changed to a `Neg` operator. Loading the ONNX model then caused this error: `Neg is not supported.` Is there any simple way to solve this, or do I have to add the operator like the others?
4. I used `squeezenet1.1.onnx`, the same as the official example, but the inference time has a large gap: about 15 ms on the official example versus about 900 ms with the sample code you mentioned. Both use WebGL as the backend with prefer set to low.

I am very grateful for your reply.
1. I'm not sure why you want to convert from NHWC to NCHW, as both PyTorch and ONNX follow the NCHW memory format. Although WebNN (WebML) uses NHWC, if you are using the ONNX importer, you don't need to worry about the format; it will handle the data reordering for you. Just think of the imported model as an NHWC model. Suppose your input data is an image: the input tensor in JS will be an array of `RGBRGBRGB...` in row-major order.
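To make that concrete, here is a minimal sketch (my own illustration, not code from the repo) of flattening canvas pixels into that NHWC `RGBRGB...` layout; normalization/mean subtraction is omitted:

```js
// Flatten RGBA canvas pixels (row-major) into the NHWC RGB layout described
// above. `pixels` is the ImageData.data of an h-by-w canvas.
function toNHWCInput(pixels, h, w) {
  const input = new Float32Array(h * w * 3);
  for (let i = 0, j = 0; i < h * w * 4; i += 4) {
    input[j++] = pixels[i];     // R
    input[j++] = pixels[i + 1]; // G
    input[j++] = pixels[i + 2]; // B (alpha channel dropped)
  }
  return input;
}
```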
2. Enabling the WebML backend requires our `webml` branch of Chromium. This repo, as its name would suggest, is a JS implementation of the WebNN API. To get close-to-native performance, you would need the Chromium build with WebNN. For the reasons mentioned here, we cannot provide you with a nightly build at present, so you need to build it yourself. We apologize for any inconvenience.
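As a quick sanity check, a sketch of my own (assuming the `navigator.ml.getNeuralNetworkContext()` entry point that the WebNN-enabled Chromium exposes; run it before the polyfill script loads, since the polyfill also defines `navigator.ml`):

```js
// A native WebNN browser (e.g. the webml Chromium branch) already exposes
// navigator.ml at page load; a stock browser does not.
const hasNativeWebNN =
  typeof navigator.ml !== 'undefined' &&
  typeof navigator.ml.getNeuralNetworkContext === 'function';
console.log(hasNativeWebNN
  ? 'Native WebNN available; the WebML backend should initialize.'
  : 'No native WebNN; WebML will fail, use WASM or WebGL instead.');
```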
3. For the `Neg` operator, you could give this patch a try. Please let me know if it doesn't work.
4. The WebGL backend allocates resources on its first run, so what you measured is warm-up time. I would suggest measuring the time from the second call to `model.compute`:
```js
const inputs = [new Float32Array(224 * 224 * 3)];
const outputs = [new Float32Array(1000)];
await model.compute(inputs, outputs); // dummy call to absorb the warm-up cost

const start = performance.now();
await model.compute(inputs, outputs);
const inferenceTime = performance.now() - start;
```
Hi @pinzhenx, thank you for your help. For 1 and 4, I got it. I will give 2 a try, and the `Neg` operator patch works fine.
Hi @pinzhenx, thank you again. I have a question about `Concat`: the importer forces `concatAxis = 3` when `axis = 1` because of NHWC, and forbids other axis values. What if I want to concatenate along some other axis?
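For context, the remapping I mean (my own summary of the layout change, not the repo's code):

```js
// Axis remapping when a model authored in NCHW runs in NHWC layout.
// NCHW axes: 0=N, 1=C, 2=H, 3=W  ->  NHWC axes: 0=N, 1=H, 2=W, 3=C
const NCHW_TO_NHWC_AXIS = [0, 3, 1, 2];
// So an ONNX Concat along axis 1 (channels) becomes axis 3 here,
// axis 2 (height) would become axis 1, and axis 3 (width) would become 2.
console.log(NCHW_TO_NHWC_AXIS[1]); // 3
```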
@Kirayue Please give this a try: https://github.com/intel/webml-polyfill/compare/master...pinzhenx:axis
@pinzhenx, it works, thank you for helping me so much. My results show that this is faster than ONNX.js using the same backend (WebGL). Could you give me some references or documents about how this repo works?
@Kirayue Our WebGL backend is built on top of tfjs-core; please refer to their posts and papers. ONNX.js implemented its WebGL backend from scratch, so it's possible that there is a gap between them.
Yeah, I saw you import `tfjs-core`. According to the ONNX.js repo, it is faster than tf.js, and tf.js is also built on top of tfjs-core, so I just wondered whether you use other techniques to accelerate. I will try Chromium next. I appreciate your help.
From our own benchmarks, we also wonder why tfjs is actually faster than ONNX.js. On the Chromium side, unfortunately, you may not fully exploit native performance yet, as we haven't implemented `TRANSPOSE` on any platform, and some other ops are not implemented on some platforms. Please refer to the supported ops list.
Ok, I got it. Thank you again for your help.
Hi @pinzhenx, it's me again. Does the order of the output tensor change when using webml (backend: WebGL)? I tested my own model using both ONNX.js and webml, and the first few values are different. The ONNX.js output is the same as the PyTorch output.

ONNX.js:
0: 0.03076646849513054
1: -0.04884304106235504
2: -0.04319038987159729
3: -0.013033781200647354
4: 1.6304280757904053
5: -1.6730701923370361
Webml:
0: 0.03076649270951748
1: 0.05190211161971092
2: -0.0018133525736629963
3: -0.001568963285535574
4: 0.0013972646556794643
5: -0.0005790712311863899
The first element is almost the same, but the following ones are not, so I wondered whether the order is changed? My input and output buffers were:
const inputs = [new Float32Array(1024*1024*3).fill(0)]
const outputs = [new Float32Array(21824*6)]
Hi @Kirayue, PyTorch and ONNX.js use the NCHW format while WebNN uses the NHWC format. That's why our framework gave you a different answer. If you want to achieve the same results as them, you have to post-process the output tensors yourself. Here's some reference: https://github.com/pinzhenx/onnxjs/blob/webnn/lib/tensor.ts#L244
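For illustration, a minimal sketch of that post-processing (a hand-rolled transpose of my own, not the linked code) for a single-image batch:

```js
// Transpose a single-image NHWC output (length h*w*c) into NCHW order.
function nhwcToNchw(nhwc, h, w, c) {
  const nchw = new Float32Array(h * w * c);
  for (let y = 0; y < h; y++) {
    for (let x = 0; x < w; x++) {
      for (let k = 0; k < c; k++) {
        nchw[k * h * w + y * w + x] = nhwc[(y * w + x) * c + k];
      }
    }
  }
  return nchw;
}
```

If your `21824*6` output is laid out as 21824 locations by 6 channels, something like `nhwcToNchw(outputs[0], 1, 21824, 6)` would reorder it (the shape is an assumption; adjust to your model).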
@Kirayue Here are the steps for building Chromium, if needed: https://github.com/intel/webml-polyfill/wiki#chromium-build-steps-with-webnn