As title, is there any sample code to show how to inference an ONNX model? I want to compare the performance between this proposal (webml) and onnx.js.
Thank you for reaching out. I'm sorry we don't have a simple example yet. You could follow the steps below to inference an ONNX model from scratch:

1. Run `npm i && npm start` under the top directory.
2. Create a `test_onnx.html` under `examples`.
3. Copy this gist into it: https://gist.github.com/pinzhenx/4a1aa06b7750bca5a8b4b5a49245ef01

Some caveats: the valid combinations of `backend` and `prefer` (line 15 of the gist) are listed below, with a usage sketch after the list.
- `backend: 'WASM'`
- `backend: 'WebGL'`
- `backend: 'WebML', prefer: 'fast'`
- `backend: 'WebML', prefer: 'sustained'`
- `backend: 'WebML', prefer: 'low'`
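As a minimal sketch of how these options are passed, assuming an importer in the style of the gist (the `OnnxModelImporter` name, the `rawModel` field, and the `createCompiledModel` step are assumptions; check your copy of `test_onnx.html` for the actual names):

```js
// Hypothetical wiring: pick one backend/prefer pair from the list above
// and hand it to the ONNX importer along with the loaded model.
const importer = new OnnxModelImporter({
  rawModel,            // the loaded ONNX model (assumed field name)
  backend: 'WebML',    // one of 'WASM' | 'WebGL' | 'WebML'
  prefer: 'sustained', // only meaningful with the 'WebML' backend
});
await importer.createCompiledModel(); // assumed compile step before compute()
```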
Also note that only the following ONNX ops are currently supported: `Conv`, `Relu`, `AveragePool`, `MaxPool`, `Concat`, `Dropout`, `GlobalAveragePool`, `Softmax`, `BatchNormalization`, `Add`, `Mul`, `Constant`, `Reshape`, `Flatten`, `Gemm`, `Sum`, `Unsqueeze`.
Hi @pinzhenx, thank you for your help. I have four questions.

1. How can I convert from NHWC to NCHW?
2. I got this error message when I used WebML as the backend: `Uncaught (in promise) Error: Fails to initialize neural network context`
3. My model contains `concat([x, -x])`, and `-x` seems to be changed to a `Neg` operator. Loading the ONNX model then caused this error: `Neg is not supported.` Is there any simple way to solve this, or do I have to add the operator like the others?
4. I used `squeezenet1.1.onnx`, the same as the official example, but the inference time has a large gap: about 15 ms on the official example versus about 900 ms with the sample code you mentioned. Both use WebGL as the backend with prefer set to low.

I am very grateful for your reply.
1. I'm not sure why you want to convert from NHWC to NCHW, as both PyTorch and ONNX follow the NCHW memory format. Although WebNN (WebML) uses NHWC, if you are using the ONNX importer, you don't need to worry about the format; it will handle the data reordering for you. Just think of the imported model as an NHWC model. Suppose your input data is an image: the input tensor in JS will be an array of `RGBRGBRGB...` in row-major order.
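To make that concrete, here is a minimal sketch (my own illustration, not code from the repo) of flattening canvas pixels into that NHWC `RGBRGB...` layout; normalization/mean subtraction is omitted:

```js
// Flatten RGBA canvas pixels (row-major) into the NHWC RGB layout described
// above. `pixels` is the ImageData.data of an h-by-w canvas.
function toNHWCInput(pixels, h, w) {
  const input = new Float32Array(h * w * 3);
  for (let i = 0, j = 0; i < h * w * 4; i += 4) {
    input[j++] = pixels[i];     // R
    input[j++] = pixels[i + 1]; // G
    input[j++] = pixels[i + 2]; // B (alpha channel dropped)
  }
  return input;
}
```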
2. Enabling the WebML backend requires our `webml` branch of Chromium. This repo, as its name would suggest, is a JS implementation of the WebNN API. To get close-to-native performance, you would need the Chromium build with WebNN. For the reasons mentioned here, we cannot provide you with a nightly build at present, so you need to build it yourself. We apologize for any inconvenience.
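As a quick sanity check, a sketch of my own (assuming the `navigator.ml.getNeuralNetworkContext()` entry point that the WebNN-enabled Chromium exposes; run it before the polyfill script loads, since the polyfill also defines `navigator.ml`):

```js
// A native WebNN browser (e.g. the webml Chromium branch) already exposes
// navigator.ml at page load; a stock browser does not.
const hasNativeWebNN =
  typeof navigator.ml !== 'undefined' &&
  typeof navigator.ml.getNeuralNetworkContext === 'function';
console.log(hasNativeWebNN
  ? 'Native WebNN available; the WebML backend should initialize.'
  : 'No native WebNN; WebML will fail, use WASM or WebGL instead.');
```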
3. For the `Neg` operator, you could give this patch a try. Please let me know if it doesn't work.
4. The WebGL backend allocates resources on its first run, so what you measured is warm-up time. I would suggest measuring the time from the second call to `model.compute`:
```js
const inputs = [new Float32Array(224 * 224 * 3)];
const outputs = [new Float32Array(1000)];
await model.compute(inputs, outputs); // dummy call to absorb the warm-up cost

const start = performance.now();
await model.compute(inputs, outputs);
const inferenceTime = performance.now() - start;
```
Hi @pinzhenx, thank you for your help. For 1 and 4, I got it. I will give 2 a try, and the `Neg` operator patch works fine.
Hi @pinzhenx, thank you again. I have a question about `Concat`: the importer forces `concatAxis = 3` when `axis = 1` because of NHWC, and forbids other axis values. What if I want to concatenate along some other axis?
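For context, the remapping I mean (my own summary of the layout change, not the repo's code):

```js
// Axis remapping when a model authored in NCHW runs in NHWC layout.
// NCHW axes: 0=N, 1=C, 2=H, 3=W  ->  NHWC axes: 0=N, 1=H, 2=W, 3=C
const NCHW_TO_NHWC_AXIS = [0, 3, 1, 2];
// So an ONNX Concat along axis 1 (channels) becomes axis 3 here,
// axis 2 (height) would become axis 1, and axis 3 (width) would become 2.
console.log(NCHW_TO_NHWC_AXIS[1]); // 3
```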
@Kirayue Please give this a try: https://github.com/intel/webml-polyfill/compare/master...pinzhenx:axis
@pinzhenx, it works, thank you for helping me so much. My results show that this is faster than ONNX.js using the same backend (WebGL). Could you give me some references or documents about how this repo works?
@Kirayue Our WebGL backend is built on top of tfjs-core; please refer to their posts and papers. ONNX.js implemented its WebGL backend from scratch, so it's possible that there is a gap between them.
Yeah, I saw you import `tfjs-core`. According to the ONNX.js repo, it is faster than tf.js, and tf.js is also built on top of tfjs-core, so I just wondered whether you use other techniques to accelerate. I will try Chromium next. I appreciate your help.
From our own benchmarks, we also wonder why tfjs is actually faster than ONNX.js. On the Chromium side, unfortunately, you may not fully exploit native performance yet, as we haven't implemented `TRANSPOSE` on any platform, and some other ops are not implemented on some platforms. Please refer to the supported ops list.
Ok, I got it. Thank you again for your help.
Hi @pinzhenx, it's me again. Does the order of the output tensor change when using webml (backend: WebGL)? I tested my own model using both ONNX.js and webml, and the first few values are different. The ONNX.js output is the same as the PyTorch output.

ONNX.js:
0: 0.03076646849513054
1: -0.04884304106235504
2: -0.04319038987159729
3: -0.013033781200647354
4: 1.6304280757904053
5: -1.6730701923370361
Webml:
0: 0.03076649270951748
1: 0.05190211161971092
2: -0.0018133525736629963
3: -0.001568963285535574
4: 0.0013972646556794643
5: -0.0005790712311863899
The first element is almost the same, but the following ones are not, so I wondered whether the order is changed? My input and output buffers were:
const inputs = [new Float32Array(1024*1024*3).fill(0)]
const outputs = [new Float32Array(21824*6)]
Hi @Kirayue, PyTorch and ONNX.js use the NCHW format while WebNN uses the NHWC format. That's why our framework gave you a different answer. If you want to achieve the same results as them, you have to post-process the output tensors yourself. Here's some reference: https://github.com/pinzhenx/onnxjs/blob/webnn/lib/tensor.ts#L244
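For illustration, a minimal sketch of that post-processing (a hand-rolled transpose of my own, not the linked code) for a single-image batch:

```js
// Transpose a single-image NHWC output (length h*w*c) into NCHW order.
function nhwcToNchw(nhwc, h, w, c) {
  const nchw = new Float32Array(h * w * c);
  for (let y = 0; y < h; y++) {
    for (let x = 0; x < w; x++) {
      for (let k = 0; k < c; k++) {
        nchw[k * h * w + y * w + x] = nhwc[(y * w + x) * c + k];
      }
    }
  }
  return nchw;
}
```

If your `21824*6` output is laid out as 21824 locations by 6 channels, something like `nhwcToNchw(outputs[0], 1, 21824, 6)` would reorder it (the shape is an assumption; adjust to your model).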
@Kirayue Here are the steps for building Chromium, if needed: https://github.com/intel/webml-polyfill/wiki#chromium-build-steps-with-webnn