PaddleOnACL

This project is still under development, so expect rough code and plenty of bugs. Contributions are welcome!

PaddleOnACL is work from my internship at Arm. It aims to port Paddle's CAPI onto ArmComputeLibrary, in place of the MKL or OpenBLAS libraries, to gain performance for deep learning applications on mobile and embedded devices.

As of 2018.04.11, it is based on Paddle's develop branch at this commit and Arm ComputeLibrary v18.03.

Tutorial

Installation instructions
Inference benchmark demo
Mobile AI Camera App Demo

Known issues:

Crash when running conv13 of MobileNet, #1
Crash at the 2nd iteration with only the activation and softmax layers enabled, #2
Crash when running conv2_1 of vgg_ssd_net, #3

Pending work:

[ ] Merge from latest paddle develop branch
[ ] Port BatchNormalization layer
[ ] Port Sigmoid Layer
[ ] Port TanH Layer
[ ] Define a bypass variable to enable and disable ACL layers
[ ] Add standard and optimized log info
[ ] Add macro definition to switch between neon/opencl and gemmConv/directConv
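
The bypass variable and the macro switch listed above are not implemented yet. As a minimal sketch of one way they could look, assuming hypothetical names that do not exist in this repository (the `PADDLE_ACL_BYPASS` environment variable, the `ACL_USE_OPENCL` and `ACL_USE_DIRECT_CONV` macros, and the `isAclBypassed` helper), the backend could be chosen at compile time and individual layers skipped at run time:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical compile-time switches (not defined by Paddle or ACL).
// Build with -DACL_USE_OPENCL and/or -DACL_USE_DIRECT_CONV to flip them.
#ifdef ACL_USE_OPENCL
constexpr const char* kAclBackend = "opencl";
#else
constexpr const char* kAclBackend = "neon";
#endif

#ifdef ACL_USE_DIRECT_CONV
constexpr const char* kConvMethod = "directConv";
#else
constexpr const char* kConvMethod = "gemmConv";
#endif

// Returns true when `layer` should fall back to the non-ACL implementation.
// `bypass` is either "all" or a comma-separated list of layer names,
// for example "conv,softmax". A null pointer means nothing is bypassed.
inline bool isAclBypassed(const std::string& layer, const char* bypass) {
  assert(!layer.empty());
  if (bypass == nullptr) return false;
  const std::string value(bypass);
  if (value == "all") return true;
  std::size_t start = 0;
  while (start <= value.size()) {
    std::size_t end = value.find(',', start);
    if (end == std::string::npos) end = value.size();
    // Compare one list entry against the layer name (exact match only).
    if (value.compare(start, end - start, layer) == 0) return true;
    start = end + 1;
  }
  return false;
}

// Convenience overload that reads the hypothetical environment variable.
inline bool isAclBypassed(const std::string& layer) {
  return isAclBypassed(layer, std::getenv("PADDLE_ACL_BYPASS"));
}
```

With this sketch, running with `PADDLE_ACL_BYPASS=conv` would leave every layer except convolution on ACL, which may help isolate crashes like the ones listed under known issues.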

Benchmark

Note:

All data was measured in Debug mode; the times should be lower when built with Release/MinSizeRel.
The unit is milliseconds.
A blank cell means TODO.
The data was collected from normal inference runs, before the crashes mentioned above occurred.
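
The tables below report the first run separately from the average of runs 2 to 10. The following is a sketch of how such numbers can be collected, not the benchmark code used for this data: `RunTiming`, `elapsedMs` and `timeInference` are hypothetical names, and `forward` stands in for whatever call runs one inference pass.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Timing result in milliseconds, matching the unit used in the tables.
struct RunTiming {
  double first_ms;     // "1st run"
  double rest_avg_ms;  // average of the remaining runs ("2nd~10th avg")
};

// Measures the wall-clock time of one call to `fn` with a monotonic clock.
inline double elapsedMs(const std::function<void()>& fn) {
  const auto t0 = std::chrono::steady_clock::now();
  fn();
  const auto t1 = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Calls `forward` `total_runs` times; reports the first run on its own
// and the mean of runs 2..total_runs.
inline RunTiming timeInference(const std::function<void()>& forward,
                               int total_runs = 10) {
  assert(total_runs >= 2);
  RunTiming timing{};
  timing.first_ms = elapsedMs(forward);
  double sum_ms = 0.0;
  for (int i = 1; i < total_runs; ++i) {
    sum_ms += elapsedMs(forward);
  }
  timing.rest_avg_ms = sum_ms / (total_runs - 1);
  return timing;
}
```

Keeping the first run out of the average avoids mixing any one-time setup cost into the steady-state figure; whether such a cost exists for a given model is an assumption to check against the measured numbers.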

Paddle/PaddleOnACL on Raspberry Pi 3

	init paddle	create model	1st run	2nd~10th avg
MobileNet	3.0/3.0	153/163	2769/	2596/
SSD	3.8/3.0	5403/5380
VGG16
AlexNet

(Just found AlexNet example here)

Paddle/PaddleOnACL on HUAWEI Mate10 Pro

	init paddle	create model	2nd~10th avg
MobileNet	0.9/0.9	68/	218/
SSD	0.8/0.9	390/	6449/
VGG16

zhaofenqiang / PaddleOnACL
