PaddlePaddle / Anakin

A high-performance, cross-platform inference engine. You can run Anakin on x86 CPU, ARM, NVIDIA GPU, AMD GPU, Bitmain, and Cambricon devices.
https://anakin.baidu.com/
Apache License 2.0

How to use the direct_conv_xxx API in sass_func.hpp #525

Open peyer opened 5 years ago

peyer commented 5 years ago

For convenience, I'll just take an instance below. If the shape of the input tensor of the conv op is [1, 32, 110, 94] ([N, C, H, W] order), the kernel shape is 3x3 ([H, W] order), the number of kernel channels is 64, the stride is 1x1 ([H, W] order), the pad is 0x0, the dilation is 1x1, and the group of the conv op is 1, then how should the parameters below be set?

- `img_in_channel_stride`
- `img_in_height_stride`
- `img_in_width_stride`
- `img_out_channel_stride`
- `img_out_height_stride`
- `img_out_width_stride`

xyoungli commented 5 years ago

> For convenience, I'll just take an instance below. If the shape of the input tensor of the conv op is [1, 32, 110, 94] ([N, C, H, W] order), the kernel shape is 3x3 ([H, W] order), the number of kernel channels is 64, the stride is 1x1 ([H, W] order), the pad is 0x0, the dilation is 1x1, and the group of the conv op is 1, then how should the parameters `img_in_channel_stride`, `img_in_height_stride`, `img_in_width_stride`, `img_out_channel_stride`, `img_out_height_stride`, `img_out_width_stride` be set?

If the tensor is contiguous, then:

- `channel_stride = w * h`
- `height_stride = w`
- `width_stride = 1`

peyer commented 5 years ago

@xyoungli I changed the parameter values according to your advice, but I still get different results compared with cuDNN. Could you take some time to check my calling code? I have sent it to you by email. Thanks a lot!