intel / webml-polyfill

Deprecated, the Web Neural Network Polyfill project has been moved to https://github.com/webmachinelearning/webnn-polyfill
Apache License 2.0

[WASM] The compilation time became larger than WASM(TFLite) for most models #1261

Closed Christywl closed 4 years ago

Christywl commented 4 years ago

Test Env:
- webml-polyfill commit: https://github.com/intel/webml-polyfill/commit/a58f9f5340ca87eca4cda18db019f42752317f9e
- Platform: Windows (Dell XPS 13, Intel i5-8250U)

Actual Result: The compilation time for TF.js WASM (https://github.com/intel/webml-polyfill/commit/a58f9f5340ca87eca4cda18db019f42752317f9e) became longer than for WASM(TFLite) (https://github.com/intel/webml-polyfill/commit/011a7f64fc65edf40c31a1d294aeefba48e4633d) for most models. For example:

| Model | WASM(TFLite) | WASM(TF.js) |
| --- | --- | --- |
| MobileNet v1 (TFLite) | 76.76 ms | 116.27 ms |
| MobileNet v2 (TFLite) | 79.04 ms | 106.48 ms |
| MobileNet v2 (ONNX) | 82.98 ms | 112.88 ms |
| ResNet50 v1 (ONNX) | 166.68 ms | 293.15 ms |

How to Reproduce:

  1. Set up a server with commit https://github.com/intel/webml-polyfill/commit/a58f9f5340ca87eca4cda18db019f42752317f9e
  2. Launch Chrome or Chromium (with WebML disabled)
  3. Visit http://localhost:8080/examples/image_classification
  4. Select one model
  5. Open DevTools and check the compilation time in the console
akineeic commented 4 years ago

The compilation time mainly records the time to generate the operands that are used as parameters to execute operations. Different backends generate different operands, and the tfjs backend usually takes more time at this step. In addition, tfjs needs an extra step that changes the kernel format here, and sometimes this step consumes a lot of time. For example, I tested with DenseNet 121 (ONNX): the tfjs WASM backend needs 175 ms to convert operands plus 142 ms to change the kernel format, while the previous TFLite backend needs 112 ms to convert operands.
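To illustrate why the format-change step costs time: TFLite stores conv2d filters in an OHWI layout ([outC, H, W, inC]) while TF.js expects HWIO ([H, W, inC, outC]), so every weight element has to be reshuffled once at compile time. A hedged sketch of that reshuffle (the layouts are as stated; the function itself is an illustration, not the polyfill's actual code):

```javascript
// Convert a flat conv2d kernel from TFLite-style OHWI [outC, H, W, inC]
// to tfjs-style HWIO [H, W, inC, outC]. A plain O(n) index remap;
// with large kernels this full pass over the weights adds up.
function ohwiToHwio(data, outC, h, w, inC) {
  const out = new Float32Array(data.length);
  for (let o = 0; o < outC; o++) {
    for (let y = 0; y < h; y++) {
      for (let x = 0; x < w; x++) {
        for (let i = 0; i < inC; i++) {
          const src = ((o * h + y) * w + x) * inC + i;
          const dst = ((y * w + x) * inC + i) * outC + o;
          out[dst] = data[src];
        }
      }
    }
  }
  return out;
}

// Tiny example: a 1x1 kernel with outC = 2, inC = 2.
const converted = ohwiToHwio(Float32Array.from([1, 2, 3, 4]), 2, 1, 1, 2);
// -> Float32Array [1, 3, 2, 4]
```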