doonny / PipeCNN

An OpenCL-based FPGA Accelerator for Convolutional Neural Networks
Apache License 2.0

How to handle data that is not a multiple of VEC_SIZE? #43

Closed: aazz44ss closed this issue 6 years ago

aazz44ss commented 6 years ago

Q1: How should data whose size is not a multiple of VEC_SIZE be handled? Take AlexNet as an example: conv1 has 11x11x3 weights, which cannot be divided evenly by a VEC_SIZE of 4. The MAC operation computes a1xb1 + a2xb2 + a3xb3 + a4xb4, so the last group would not have a3 and a4. Are these automatically set to 0?

Q2: It also looks like data_vec reads bottom linearly, VEC_SIZE elements at a time. The PipeCNN paper describes the weights as being divided into groups of VEC_SIZE along the Z direction. For example, with 3x3x4 weights, I should have VEC_SIZE (4) groups of weights along the Z direction, each holding 3x3x1 = 9 values:

0, 1, 2, 3, ..., 8
9, 10, 11, 12, ..., 17
18, ...
27, ...

But in the algorithm, the data is grouped into data_vec linearly: {0, +1, +2, +3}, {+4, +5, +6, +7}, ..., +35. How does this divide the weights along the Z direction?

for(unsigned short win_itm_z=0; win_itm_z<weight_dim3/VEC_SIZE; win_itm_z++){
    for(unsigned char  win_itm_y=0; win_itm_y<win_size_y; win_itm_y++){
        for(unsigned char  win_itm_x=0; win_itm_x<win_size_x; win_itm_x++){
            feature_idx_dim1 = win_itm_x;
            feature_idx_dim2 = win_itm_y;
            feature_idx_dim3 = win_itm_z;
            if(xy is at correct location){  
                data_vec = bottom[data_offset*data_dim1xdim2 + feature_idx_dim3*data_dim1xdim2 + (feature_idx_dim2-padding)*data_dim1 + (feature_idx_dim1-padding)];
            }
            else{
                #pragma unroll
                for(unsigned char vv=0; vv<VEC_SIZE; vv++){
                    data_vec.data[vv] = CZERO;
                }
            }   
            // start from using buffer[0]
            win_buffer[0][win_itm_z*win_size_y*win_size_x + win_itm_y*win_size_x + win_itm_x] = data_vec;
        }
    }
}
doonny commented 6 years ago

Hi, if the number of input channels is not a multiple of VEC_SIZE, the data should be padded with zeros. The data has already been packed into groups of VEC_SIZE bytes along the Z direction by the host program.
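
To make the answer concrete, here is a minimal sketch of what such host-side packing could look like. It is not the actual PipeCNN host code: the helper names `padded_channels` and `pack_vec_z`, the planar [channels][height][width] input layout, and the 8-bit element type are all assumptions made for illustration. The sketch rounds the channel count up to a multiple of VEC_SIZE, fills the extra lanes with zeros, and stores the VEC_SIZE channel values of each (x, y) position next to each other.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VEC_SIZE 4 /* assumed to match the VEC_SIZE used by the kernel */

/* Channel count rounded up to the next multiple of VEC_SIZE. */
static size_t padded_channels(size_t channels) {
    return (channels + VEC_SIZE - 1) / VEC_SIZE * VEC_SIZE;
}

/*
 * Hypothetical helper (not taken from PipeCNN): repack a planar
 * [channels][height][width] tensor into
 * [padded_channels / VEC_SIZE][height][width][VEC_SIZE].
 * dst must hold padded_channels(channels) * height * width bytes.
 */
static void pack_vec_z(const signed char *src, signed char *dst,
                       size_t width, size_t height, size_t channels) {
    size_t padded = padded_channels(channels);
    assert(src != NULL && dst != NULL);

    /* Zero everything first so the lanes of missing channels stay 0. */
    memset(dst, 0, padded * height * width);

    for (size_t z = 0; z < channels; z++) {
        for (size_t y = 0; y < height; y++) {
            for (size_t x = 0; x < width; x++) {
                /* Index of the vector, in units of VEC_SIZE elements. */
                size_t vec_idx = (z / VEC_SIZE) * height * width + y * width + x;
                /* Lane inside the vector is the channel modulo VEC_SIZE. */
                dst[vec_idx * VEC_SIZE + z % VEC_SIZE] =
                    src[(z * height + y) * width + x];
            }
        }
    }
}
```

Under this assumed layout, `vec_idx` has the same form as the kernel's `feature_idx_dim3*data_dim1xdim2 + y*data_dim1 + x` index (with no padding and a zero offset), so one linear read of a vector already returns VEC_SIZE consecutive channels of a single position. For a 3-channel input such as AlexNet's conv1, the fourth lane of every vector is 0 and contributes nothing to the MAC result.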