Maratyszcza / NNPACK

Acceleration package for neural networks on multi-core CPUs
BSD 2-Clause "Simplified" License

NNPACK gave a totally wrong result on Android :| #140

Closed by daquexian 6 years ago

daquexian commented 6 years ago

The model is a faster-rcnn-MobileNetV2-FPN model, which contains many depthwise convolutions. In addition, I changed the convolution in the "detection head" of faster-rcnn to a depthwise convolution. I'm using Caffe2 with Detectron ops so that I can run FPN models on Android. :)

Without NNPACK:

.......
I0508 14:26:59.701576 23290 operator.cc:167] Engine NNPACK is not available for operator Conv.
I0508 14:27:02.524704 23290 test_caffe2.cpp:99] Running time: 2788.48
I0508 14:27:02.524919 23290 test_caffe2.cpp:106] 5 objects
I0508 14:27:02.524966 23290 test_caffe2.cpp:109] class: 1
I0508 14:27:02.525017 23290 test_caffe2.cpp:110] score: 0.970812
I0508 14:27:02.525063 23290 test_caffe2.cpp:111] bbox: 503.247, 271.942, 532.193, 352.145
I0508 14:27:02.525126 23290 test_caffe2.cpp:109] class: 1
I0508 14:27:02.525172 23290 test_caffe2.cpp:110] score: 0.960696
I0508 14:27:02.525218 23290 test_caffe2.cpp:111] bbox: 636.642, 279.933, 665.662, 343.2
I0508 14:27:02.525274 23290 test_caffe2.cpp:109] class: 1
I0508 14:27:02.525319 23290 test_caffe2.cpp:110] score: 0.0725886
I0508 14:27:02.525365 23290 test_caffe2.cpp:111] bbox: 93.1686, 150.505, 1685, 780.773
I0508 14:27:02.525420 23290 test_caffe2.cpp:109] class: 3
I0508 14:27:02.525467 23290 test_caffe2.cpp:110] score: 0.530345
I0508 14:27:02.525514 23290 test_caffe2.cpp:111] bbox: 515.964, 274.124, 603.537, 333.57
I0508 14:27:02.525569 23290 test_caffe2.cpp:109] class: 8
I0508 14:27:02.525615 23290 test_caffe2.cpp:110] score: 0.228379
I0508 14:27:02.525661 23290 test_caffe2.cpp:111] bbox: 510.722, 271.072, 604.24, 335.786

With NNPACK:

I0508 14:19:37.618585 22141 test_caffe2.cpp:99] Running time: 2349.08
I0508 14:19:37.618890 22141 test_caffe2.cpp:106] 4 objects
I0508 14:19:37.618935 22141 test_caffe2.cpp:109] class: 1
I0508 14:19:37.618983 22141 test_caffe2.cpp:110] score: 0.18578
I0508 14:19:37.619029 22141 test_caffe2.cpp:111] bbox: 639.026, 277.556, 664.019, 346.954
I0508 14:19:37.619091 22141 test_caffe2.cpp:109] class: 1
I0508 14:19:37.619134 22141 test_caffe2.cpp:110] score: 0.185452
I0508 14:19:37.619179 22141 test_caffe2.cpp:111] bbox: 503.663, 262.692, 527.753, 353.602
I0508 14:19:37.619233 22141 test_caffe2.cpp:109] class: 1
I0508 14:19:37.619277 22141 test_caffe2.cpp:110] score: 0.078429
I0508 14:19:37.619320 22141 test_caffe2.cpp:111] bbox: 0, 262.418, 1206.26, 920.9
I0508 14:19:37.619379 22141 test_caffe2.cpp:109] class: 1
I0508 14:19:37.619421 22141 test_caffe2.cpp:110] score: 0.0684222
I0508 14:19:37.619465 22141 test_caffe2.cpp:111] bbox: 19.5151, 0, 1685, 686.885

The result without NNPACK is correct, since it matches the result produced on a GPU.
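To put a number on the difference between the two runs, here is a minimal sketch that matches detections by class and box overlap (IoU) and reports the score drop. The detections are copied from the two logs above (only the two high-confidence class-1 boxes); the helper names `iou` and `compare` are made up for illustration and are not part of Caffe2 or NNPACK.

```python
# Detections copied from the logs above: (class, score, (x1, y1, x2, y2)).
reference = [  # without NNPACK
    (1, 0.970812, (503.247, 271.942, 532.193, 352.145)),
    (1, 0.960696, (636.642, 279.933, 665.662, 343.2)),
]
candidate = [  # with NNPACK
    (1, 0.18578, (639.026, 277.556, 664.019, 346.954)),
    (1, 0.185452, (503.663, 262.692, 527.753, 353.602)),
]


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def compare(reference, candidate):
    """For each reference detection, find the best-overlapping candidate of
    the same class and return a list of (best_iou, score_drop) pairs."""
    report = []
    for cls, score, box in reference:
        same_class = [
            (iou(box, c_box), c_score)
            for c_cls, c_score, c_box in candidate
            if c_cls == cls
        ]
        best_iou, best_score = max(same_class, default=(0.0, 0.0))
        report.append((best_iou, score - best_score))
    return report


for best_iou, score_drop in compare(reference, candidate):
    print(f"IoU {best_iou:.3f}, score drop {score_drop:.3f}")
```

For these two boxes the locations still overlap well, while the scores fall sharply, so the NNPACK run roughly finds the same objects but with much lower confidence.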

I didn't modify the code when switching between NNPACK and non-NNPACK, and the faster-rcnn-ResNet-50-FPN model works correctly with NNPACK.

My phone is a Google Pixel (arm64-v8a) running Android 8.1. I can provide a minimal project and my model if you need them.

BTW, I don't know how to use NNPACK with Caffe2 on a PC, so I haven't tested it there :)

Maratyszcza commented 6 years ago

Please try these steps:

  1. Update NNPACK and cpuinfo submodules in Caffe2 (cd third-party/NNPACK && git pull origin master, cd third-party/cpuinfo && git pull origin master).
  2. If it doesn't help, run your binary with --caffe2_profile_nnpack true and post the results of the convolution.
  3. It would also help if you could show the text representation of the protobuf file for the prediction network (note: not init network). You can get it with protoc --decode caffe2.NetDef /path/to/caffe2/proto/caffe2.proto -I /path/to/caffe2/proto < predict_net.pb > predict_net.pbtxt
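
If you would rather script step 3, here is a minimal sketch that wraps the same `protoc` invocation in Python. It assumes `protoc` is on your `PATH`. The function names `protoc_decode_command` and `dump_predict_net` are hypothetical, and the proto directory is a placeholder you need to fill in yourself.

```python
import subprocess


def protoc_decode_command(proto_dir):
    """Build the protoc argument list from step 3 for decoding a caffe2.NetDef.

    proto_dir is the directory that contains caffe2.proto (a placeholder here).
    """
    return [
        "protoc",
        "--decode", "caffe2.NetDef",
        f"{proto_dir}/caffe2.proto",
        "-I", proto_dir,
    ]


def dump_predict_net(proto_dir, pb_path="predict_net.pb",
                     pbtxt_path="predict_net.pbtxt"):
    """Decode the binary prediction network into its text representation.

    Equivalent to the shell redirection `< predict_net.pb > predict_net.pbtxt`.
    """
    with open(pb_path, "rb") as src, open(pbtxt_path, "wb") as dst:
        subprocess.run(protoc_decode_command(proto_dir),
                       stdin=src, stdout=dst, check=True)
```

Calling `dump_predict_net("/path/to/caffe2/proto")` from the directory that holds `predict_net.pb` should write `predict_net.pbtxt` next to it; `check=True` raises an exception if `protoc` fails.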

daquexian commented 6 years ago

@Maratyszcza Updating NNPACK and cpuinfo to the latest version solved the problem. Thanks!

Maratyszcza commented 6 years ago

Thanks, this is helpful. I'll update the submodules in the pytorch/pytorch repo to make sure this problem doesn't recur.