Christywl opened this issue 4 years ago
Allocating a new MTLBuffer and mapping new memory from SharedBufferMapping for a given length caused the regression. Maybe we should apply the optimization of reducing new memory mappings to the other backend implementations as well.
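For context, here is a minimal Python sketch of the idea (the real code is Chromium C++/Objective-C; `BufferPool` and its methods are hypothetical names used only for illustration): instead of allocating a new MTLBuffer and mapping new shared memory on every inference, keep the last mapping and reuse it whenever the requested length fits.

```python
class BufferPool:
    """Hypothetical sketch of the reuse optimization: keep the last
    mapped buffer and hand it back when the requested length fits,
    instead of allocating and mapping a new one on every call."""

    def __init__(self):
        self._cached = None      # (buffer, capacity) or None
        self.allocations = 0     # count real allocations, for illustration

    def map_buffer(self, length):
        if self._cached is not None and self._cached[1] >= length:
            return self._cached[0]   # reuse: no new allocation or mapping
        buf = bytearray(length)      # stands in for MTLBuffer + shared mapping
        self.allocations += 1
        self._cached = (buf, length)
        return buf

pool = BufferPool()
for _ in range(100):             # e.g. 100 inferences with same-size inputs
    pool.map_buffer(1 << 20)
print(pool.allocations)          # → 1 (only the first call really allocates)
```

With per-call allocation this loop would allocate 100 times; with the cache it allocates once, which is where the latency win comes from.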
That's a good optimization. Thanks @fujunwei!
> Maybe we need to apply the optimization of reducing new memory mappings to the other backend implementations.
Do you know which backends this optimization can be applied to?
https://github.com/otcshare/chromium-src/pull/18 is merged. @Christywl, please help verify the performance regression issue. Thanks.
> Do you know which backends this optimization can be applied to?
The backends that map new memory each time include the BNNS, CLDNN, DML, DNNL, and Inference Engine implementations.
> The backends that map new memory each time include the BNNS, CLDNN, DML, DNNL, and Inference Engine implementations.
Great, could you please file an issue to track it? Thanks.
Done with issue https://github.com/intel/webml-polyfill/issues/1140. Thanks.
@fujunwei, the performance of these models on the latest build https://github.com/otcshare/chromium-src/commit/775b7014eff6866bcad4921d61b41ac0d1c98d54 still shows a problem. I ran the tests several times, and the numbers are not stable (including the MobileNet v2 models); inference times are in ms, one column per run:
Models | Run 1 (ms) | Run 2 (ms) | Run 3 (ms) | Run 4 (ms) | Run 5 (ms) | Run 6 (ms) |
---|---|---|---|---|---|---|
MobileNet v1 (TFLite) | 13.28±2.59 | 12.18±1.16 | 15.44±1.31 | 11.98±1.08 | 13.11±1.26 | 12.38±1.37 |
MobileNet v2 (TFLite) | 16.77±1.26 | 15.64±1.05 | 12.29±1.50 | 15.66±0.98 | 13.04±1.66 | 15.07±0.93 |
SqueezeNet (TFLite) | 12.88±1.69 | 16.36±1.06 | 13.33±1.38 | 16.03±0.59 | 15.84±0.84 | 13.24±1.59 |
MobileNet v2 (ONNX) | 17.16±1.66 | 15.87±1.21 | 17.09±0.71 | 15.90±1.16 | 17.21±0.79 | 15.81±1.11 |
MobileNet v1 (OpenVINO) | 14.47±3.87 | 12.66±1.34 | 11.34±1.03 | 15.15±0.91 | 12.31±1.15 | 12.24±0.98 |
MobileNet v2 (OpenVINO) | 16.36±0.52 | 15.59±1.60 | 12.18±1.06 | 14.39±1.78 | 15.73±0.45 | 12.47±1.36 |
Inception v2 (OpenVINO) | 20.98±2.31 | 22.75±4.75 | 20.69±1.80 | 23.12±4.44 | 20.97±2.61 | 20.99±2.80 |
SSD MobileNet v1 (TFLite) | 30.72±1.05 Decode Time: 0.46±1.11 | 21.64±3.19 Decode Time: 0.45±1.13 | 23.58±2.25 Decode Time: 0.45±1.11 | 21.86±2.65 Decode Time: 0.45±1.16 | 31.39±1.24 Decode Time: 0.44±1.11 | 21.30±3.30 Decode Time: 0.45±1.08 |
SSD MobileNet v2 (TFLite) | 34.98±2.26 Decode Time: 0.46±0.95 | 36.07±1.49 Decode Time: 0.46±1.12 | 33.88±1.26 Decode Time: 0.46±1.14 | 35.11±0.75 Decode Time: 0.49±1.10 | 33.82±0.83 Decode Time: 0.47±1.14 | 34.04±1.64 Decode Time: 0.48±1.11 |
SSDLite MobileNet v2 (TFLite) | 33.26±0.94 Decode Time: 0.45±1.19 | 33.41±1.07 Decode Time: 0.49±1.12 | 33.84±0.54 Decode Time: 0.44±1.11 | 32.90±0.53 Decode Time: 0.48±1.13 | 33.80±0.51 Decode Time: 0.47±1.11 | 32.67±0.56 Decode Time: 0.45±1.11 |
PoseNet | 16.23±0.82 Decode Time: 3.43±1.59 | 17.22±1.88 Decode Time: 3.30±1.62 | 16.91±2.07 Decode Time: 3.25±1.59 | 17.59±1.19 Decode Time: 3.38±1.60 | 16.36±1.67 Decode Time: 3.25±1.65 | 16.73±2.08 Decode Time: 3.32±1.63 |
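As a rough sanity check on "not stable" (a back-of-the-envelope calculation, not part of the test setup), the run-to-run spread of the per-run means can be compared against the per-run standard deviations. Using the MobileNet v1 (TFLite) means from the table above:

```python
import statistics

# Mean inference latencies (ms) for MobileNet v1 (TFLite), one per run,
# copied from the table above.
runs = [13.28, 12.18, 15.44, 11.98, 13.11, 12.38]

spread = max(runs) - min(runs)    # gap between fastest and slowest run mean
stdev = statistics.stdev(runs)    # sample std dev of the run means
print(round(spread, 2), round(stdev, 2))  # → 3.46 1.27
```

The 3.46 ms gap between the fastest and slowest run means is larger than most of the reported per-run deviations, which matches the observation that the numbers fluctuate from run to run.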
Meanwhile, other models look good, for example:
Models | Run 1 (ms) | Run 2 (ms) | Run 3 (ms) |
---|---|---|---|
Inception v3 (TFLite) | 43.47±2.42 | 43.67±2.61 | 43.78±2.70 |
Inception v4 (TFLite) | 79.51±3.04 | 79.86±3.08 | 79.87±3.14 |
SqueezeNet (ONNX) | 12.03±2.06 | 11.93±0.70 | 11.83±0.66 |
ResNet50 v1 (ONNX) | 34.35±2.15 | 34.36±1.99 | 34.36±2.07 |
SqueezeNet (OpenVINO) | 12.05±1.73 | 11.93±0.66 | 11.92±0.69 |
So I am reopening this issue; please take a look.
Test Env:
- Chromium Version: nightly build 79.0.3917.0 (https://github.com/otcshare/chromium-src/commit/270f639f9ab6be2eeeb86fb9ab82930cf94fb60a)
- Platform: macOS 10.14.5
Expected Result: No regression.
Actual Result: A performance regression occurred for some models on the MPS backend; some commit between https://github.com/otcshare/chromium-src/commit/4e4a1a3c08ac0dc84d9d313c50f14e56c82dca8c and https://github.com/otcshare/chromium-src/commit/270f639f9ab6be2eeeb86fb9ab82930cf94fb60a caused this issue:
How to Reproduce: