intel / webml-polyfill

Deprecated: the Web Neural Network Polyfill project has moved to https://github.com/webmachinelearning/webnn-polyfill
Apache License 2.0

[MPS] performance optimization #332

Closed: huningxin closed this issue 5 years ago

huningxin commented 5 years ago

According to Apple* Machine Learning on Intel® Processor Graphics, the MobileNet inference performance is over 150 images/sec, which is faster than the WebML/MPS inference performance. We need to investigate the data, identify the potential gap, and try to close it as much as possible.
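For reference, a quick sketch (in plain C, not from the issue) that converts the cited throughput into the per-image latency used in the comments below:

```c
#include <stdio.h>

int main(void) {
    double images_per_sec = 150.0;                 /* figure cited above */
    double ms_per_image = 1000.0 / images_per_sec; /* ~6.7 ms per image */
    printf("%.1f images/sec ~= %.1f ms/image\n", images_per_sec, ms_per_image);
    return 0;
}
```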

fujunwei commented 5 years ago

That figure only includes GPU time, measured at l.2596 - l.2599 in main.mm; the related code is below:

commit = mach_absolute_time();       // timestamp before waiting on the last command buffer
[cmdBufs[last] waitUntilCompleted];  // block until the GPU finishes the last command buffer
complete = mach_absolute_time();     // timestamp after GPU work completes
elapsedTime = complete - commit;     // elapsed GPU time, in mach ticks
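For anyone reproducing this measurement, here is a minimal sketch in plain C (not from main.mm; the command-buffer wait is only indicated by a comment) of converting the mach_absolute_time() ticks above into milliseconds with mach_timebase_info():

```c
#include <mach/mach_time.h>
#include <stdint.h>
#include <stdio.h>

/* Convert mach_absolute_time() ticks to milliseconds via the timebase ratio. */
static double ticks_to_ms(uint64_t ticks) {
    mach_timebase_info_data_t tb;
    mach_timebase_info(&tb);                  /* numer/denom maps ticks to nanoseconds */
    return (double)ticks * tb.numer / tb.denom / 1.0e6;
}

int main(void) {
    uint64_t commit = mach_absolute_time();
    /* ... [cmdBufs[last] waitUntilCompleted] runs here in main.mm ... */
    uint64_t complete = mach_absolute_time();
    printf("elapsed: %.3f ms\n", ticks_to_ms(complete - commit));
    return 0;
}
```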

fujunwei commented 5 years ago

The default test uses random data, which is why the reported number is fast (see main.mm l.1593), but that data is not a real image. Setting the gGetPrediction variable to true so that final0.jpg is tested, the prediction time is similar to WebML/MPS, so I don't think this is a gap we need to close.
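A rough sketch, in plain C rather than the sample's Objective-C, of the two benchmark paths described above; gGetPrediction and final0.jpg come from main.mm, while the input size and everything else here are assumptions:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define INPUT_ELEMENTS (224 * 224 * 3)   /* assumed MobileNet-style input size */

static bool gGetPrediction = false;      /* flag from main.mm; false = random input */

/* Fill the network input either with random values (the default, fast path that
   yields the >150 images/sec figure) or with a decoded real image. */
static void prepare_input(float *input) {
    if (gGetPrediction) {
        /* Realistic path: decode final0.jpg into the input buffer here
           (decoding code omitted); this is the case whose timing is
           comparable to WebML/MPS. */
    } else {
        /* Default path: random data, not a real image. */
        for (size_t i = 0; i < INPUT_ELEMENTS; ++i)
            input[i] = (float)rand() / (float)RAND_MAX;
    }
}
```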

fujunwei commented 5 years ago

The prediction time is 13 ms using the Core ML model, while WebML takes 16 ms with the MPS backend.