huningxin closed this issue 5 years ago.
I first added quantized versions of Conv2D, DepthwiseConv2D, AveragePool, Reshape and Softmax to the WASM backend, and modified the TFLite model loader to support quantized models. Then I tried the Mobilenet_V2_1.0_224_quant model for image classification.
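For context, the quantized ops and the loader follow TFLite's per-tensor affine quantization scheme, where a real value is recovered as real = scale * (q - zero_point). The snippet below is only an illustrative sketch of that mapping; the interface and function names are mine, not the actual backend code:

```ts
// Per-tensor affine quantization parameters, as stored by TFLite.
interface QuantParams {
  scale: number;      // real-valued step size
  zeroPoint: number;  // uint8 value that maps to real 0.0
}

// real = scale * (q - zeroPoint)
function dequantize(q: Uint8Array, params: QuantParams): Float32Array {
  const out = new Float32Array(q.length);
  for (let i = 0; i < q.length; ++i) {
    out[i] = params.scale * (q[i] - params.zeroPoint);
  }
  return out;
}

// Inverse mapping, clamped to the uint8 range [0, 255].
function quantize(x: Float32Array, params: QuantParams): Uint8Array {
  const out = new Uint8Array(x.length);
  for (let i = 0; i < x.length; ++i) {
    const q = Math.round(x[i] / params.scale) + params.zeroPoint;
    out[i] = Math.min(255, Math.max(0, q));
  }
  return out;
}
```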
Here are the results. Quantized model:
Float model:
The class seems to be right, but I still need to do some post-processing on the probability. Another strange thing is that the quantized model is slower than the float model. Maybe I need quantized versions of the test cases to check that the ops work as expected.
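To be concrete about the post-processing I mean: dequantize the uint8 output into real-valued probabilities and take the top classes. A minimal sketch, assuming a single per-tensor scale and zero point on the output tensor (not the demo code itself):

```ts
// Dequantize the uint8 output tensor and pick the top-k classes.
// outputScale / outputZeroPoint come from the output tensor's metadata.
function topKFromQuantizedOutput(
    output: Uint8Array,
    outputScale: number,
    outputZeroPoint: number,
    k: number): Array<{classId: number; prob: number}> {
  const probs = Array.from(output, (q, i) => ({
    classId: i,
    prob: outputScale * (q - outputZeroPoint),
  }));
  probs.sort((a, b) => b.prob - a.prob);
  return probs.slice(0, k);
}
```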
Good progress! Please refer to the TFLite demo code for post-processing. We need to make sure the result is correct as a first step. Thanks!
Found that the Mobilenet_V2_1.0_224_quant model has no softmax layer, so I add one automatically. After post-processing, I get the following result:
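For reference, what the appended softmax computes is the usual float softmax over the dequantized logits; the in-graph op itself is the quantized equivalent, so the sketch below is illustrative only:

```ts
// Numerically stable float softmax over dequantized logits.
function softmax(logits: Float32Array): Float32Array {
  let max = -Infinity;
  for (const v of logits) max = Math.max(max, v);
  const exps = new Float32Array(logits.length);
  let sum = 0;
  for (let i = 0; i < logits.length; ++i) {
    exps[i] = Math.exp(logits[i] - max);
    sum += exps[i];
  }
  for (let i = 0; i < exps.length; ++i) {
    exps[i] /= sum;
  }
  return exps;
}
```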
That's great, thanks @Wenzhao-Xiang !
We need to verify two aspects:
@huningxin Test env:
Chromium version: nightly build 70.0.3503.0 (a7c5589)
Platform: Android 9.0 (Google Pixel 2XL)
Quantized model (Mobilenet_v2):
Float model (Mobilenet_v2):
We get about a 2x speedup on WebNN/NNAPI.
Next I will investigate why the quantized model runs slower than the float one with the WASM ops.
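One suspect (only an assumption on my side until I profile it) is the requantization step that every quantized op needs after its int32 accumulation; that is extra per-element work the float kernels do not pay, and it is not negligible in a naive scalar WASM loop without SIMD. A sketch of that step:

```ts
// After the int32 accumulation of (uint8 - zeroPoint) products, every output
// element has to be rescaled back into uint8 with
//   realMultiplier = inputScale * filterScale / outputScale
// This per-element rescale is work the float path does not have.
function requantize(
    acc: Int32Array,
    realMultiplier: number,
    outputZeroPoint: number): Uint8Array {
  const out = new Uint8Array(acc.length);
  for (let i = 0; i < acc.length; ++i) {
    const q = Math.round(acc[i] * realMultiplier) + outputZeroPoint;
    out[i] = Math.min(255, Math.max(0, q));
  }
  return out;
}
```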
The float model number seems slower than what we collected previously. @Wenzhao-Xiang, could you please double-check with @BruceDai?
Also, according to AI Benchmark, the Pixel 3 shows a significant speedup on int8 quantized models. Please help test on that device. Thanks!
@huningxin Test with the benchmark:
Chromium version: nightly build 70.0.3503.0 (a7c5589)
Platform: Android 9.0 (Google Pixel 2XL)
Quantized Mobilenet_V2 inference time: 79.18 ± 25.28 ms
Float Mobilenet_V2 inference time: 109.41 ± 21.64 ms

Chromium version: nightly build 70.0.3503.0 (a7c5589)
Platform: Android 9.0 (Google Pixel 3)
Quantized Mobilenet_V2 inference time: 8.89 ± 1.25 ms
Float Mobilenet_V2 inference time: 105.37 ± 20.90 ms
Summary:
About 1.38x speedup on Google Pixel 2XL
About 10x speedup on Google Pixel 3
Also supported ssd_mobilenet_v1_quant for object detection. Here are the tests with the examples (Google Pixel 3):
Image Classification:
Object Detection:
There is really a significant speedup on Google Pixel 3, even faster than the WebGL backend with a 1080 Ti, which amazes me!
Impressed! Great job @Wenzhao-Xiang !
Please follow up as we discussed:
Thanks!
Done. Closing it.
We need to investigate quantized model support for the WebNN API.
Some TODOs I have in mind: