alibaba / MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
http://www.mnn.zone/

Reshape or Flatten gives wrong results with OpenCL #211

Closed biwe closed 5 years ago

biwe commented 5 years ago

GPU: IMG PowerVR Series6XT GX6650
Framework: Caffe converted to MNN
Backend: OpenCL
Model: MobileNet-SSD

On the CPU the result is correct; on the GPU the result is wrong at the flatten layer. I suspect that the OpenCL backend does not convert the data format from NC/4HW4 to NCHW. The feature maps of the affected layers are as follows:

CPU:

afterCallBack op name: conv11_mbox_loc_perm
====== Tensor 0x106abdb0 ======
Dimension: 1, 19, 19, 12,
Data: 0.335271, 0.592223, 0.998025, 1.395067, 1.427762, 1.088770, 0.240384, 0.792928, -8.728149, -7.821217, -6.197500, -8.131077, -5.805121, -3.239074, -1.806096, -5.083641, 1.422127, 1.273126, 1.950419, 2.709154, 0.377783, 0.322982, 0.193904, 0.613993, -9.883196, -8.627235, -9.351952, -10.933924, -4.705492, -2.748928...
NC/4HW4

afterCallBack op name: conv11_mbox_loc_flat
====== Tensor 0x106abed0 ======
Dimension: 1, 4332, 1, 1,
Data: 0.335271, 1.427762, -8.728149, -5.805121, 1.422127, 0.377783, -9.883196, -4.705492, 0.125618, 2.224391, -7.150221, -5.321198, 0.691812, 1.768733, -6.407306, -5.312942, 0.932490, 0.251554, -9.472221, -5.963003, 0.430472, 2.379925, -4.173745, -5.568319, 0.697052, 1.669839, -3.217705, -3.864530, 1.206202, 0.362562...
NC/4HW4

GPU:

afterCallBack op name: conv11_mbox_loc_perm
====== Tensor 0xbeffdb0 ======
Dimension: 1, 19, 19, 12,
Data: 0.348389, 0.598633, 1.008789, 1.392578, 1.420898, 1.084961, 0.234741, 0.784668, -8.703125, -7.738281, -6.140625, -8.078125, -5.792969, -3.208984, -1.790039, -5.046875, 1.428711, 1.272461, 1.948242, 2.699219, 0.377441, 0.325928, 0.188354, 0.613770, -9.851562, -8.554688, -9.281250, -10.882812, -4.687500, -2.728516...
NC/4HW4

afterCallBack op name: conv11_mbox_loc_flat
====== Tensor 0xbeffed0 ======
Dimension: 1, 4332, 1, 1,
Data: 0.348389, 0.598633, 1.008789, 1.392578, 1.230469, 1.166016, 0.899414, 1.253906, 1.307617, 0.807129, 1.518555, 2.009766, 1.794922, 1.454102, 1.449219, 1.291992, 1.539062, 1.086914, 0.478271, 1.420898, 1.084961, 0.234741, 0.784668, 2.490234, 1.035156, 0.338867, 0.985840, -0.382080, -0.415771, 0.630859...
NC/4HW4
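To make the suspected layout mismatch concrete, here is a minimal sketch in plain Python of how an NC/4HW4 buffer relates to NCHW. It assumes the usual packing in which channels are grouped in blocks of 4 and stored as [N, ceil(C/4), H, W, 4]. The helper names (`packed_index`, `nchw_to_nc4hw4`, `nc4hw4_to_nchw`) are hypothetical and are not MNN APIs. It only illustrates the index arithmetic and does not show what the OpenCL backend actually does. In the CPU dump above, the flatten output equals every fourth value of the permute dump (0.335271, 1.427762, -8.728149, ...), which is what this unpacking produces for channel 0.

```python
def packed_index(ni, ci, hi, wi, c4, h, w):
    """Offset of element (ni, ci, hi, wi) in an NC/4HW4 buffer.

    Assumed layout: [N, ceil(C/4), H, W, 4], i.e. 4 channels interleaved
    per spatial position.
    """
    return (((ni * c4 + ci // 4) * h + hi) * w + wi) * 4 + ci % 4


def nchw_to_nc4hw4(flat, n, c, h, w):
    """Pack an NCHW list into NC/4HW4, zero-padding channels to a multiple of 4."""
    c4 = (c + 3) // 4
    packed = [0.0] * (n * c4 * h * w * 4)
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    src = ((ni * c + ci) * h + hi) * w + wi
                    packed[packed_index(ni, ci, hi, wi, c4, h, w)] = flat[src]
    return packed


def nc4hw4_to_nchw(packed, n, c, h, w):
    """Unpack an NC/4HW4 buffer into a plain NCHW list (padding is dropped)."""
    c4 = (c + 3) // 4
    out = [0.0] * (n * c * h * w)
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    dst = ((ni * c + ci) * h + hi) * w + wi
                    out[dst] = packed[packed_index(ni, ci, hi, wi, c4, h, w)]
    return out


# Toy tensor: N=1, C=4, H=1, W=2, packed buffer holds 0..7.
toy_packed = list(range(8))
print(nc4hw4_to_nchw(toy_packed, 1, 4, 1, 2))  # [0, 4, 1, 5, 2, 6, 3, 7]
```

A flatten that copies the packed buffer as-is would return 0, 1, 2, 3, ... for the toy tensor instead of 0, 4, 1, 5, ..., so a missing or different conversion step changes the element order even when the individual values are correct.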

How can I solve this? Thanks!

Zhuoyao1012 commented 4 years ago

I ran into the same issue on a Mali GPU. How did you solve it? @biwe