pjreddie / darknet

Convolutional Neural Networks
http://pjreddie.com/darknet/

batch size for forward propagation #529

Open HyunjunShin opened 6 years ago

HyunjunShin commented 6 years ago

This is from the "forward_convolutional_layer" function in the "src/convolutional_layer.c" file:

int m = l.n/l.groups;
int k = l.size*l.size*l.c/l.groups;
int n = l.out_w*l.out_h;
for(i = 0; i < l.batch; ++i){
    for(j = 0; j < l.groups; ++j){
        float *a = l.weights + j*l.nweights/l.groups;
        float *b = net.workspace;
        float *c = l.output + (i*l.groups + j)*n*m;
        im2col_cpu(net.input + (i*l.groups + j)*l.c/l.groups*l.h*l.w,
            l.c/l.groups, l.h, l.w, l.size, l.stride, l.pad, b);
        gemm(0,0,m,n,k,1,a,k,b,n,1,c,n);
    }
}

c (the output data pointer) is offset by i (the batch index). But a (the weight data pointer) and b (the feature map data pointer) do not depend on i. I think this is a logic error.

So when I run inference with any network model configured with a batch size greater than 1, the result is wrong. If I set the batch size to 1 in the *.cfg file, I get the correct result. But I think we need a batch size greater than 1 for training. So I wonder whether everything would be okay if I removed the code involving i and batch from the forward propagation function.

sivagnanamn commented 6 years ago

There's no error in this implementation.

Conv filter weights do not change with batch size. Every conv layer has only one set of filter weights, stored in a float array. The number of weights in the array depends on the size of the filters and the number of filters.

Ex: Say we have a conv layer with 16 filters of size 3x3, and the layer has 3 input channels. Then the weights array will contain 16x3x3x3 float numbers. The same weights array is convolved with every input image in the batch.
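To make the count concrete, here is a minimal sketch. conv_weight_count is a hypothetical helper written for this comment, not a darknet function, and it assumes the channel count is divisible by groups. The point is only that batch never appears in the formula.

```c
#include <stddef.h>

/* Hypothetical helper (not part of darknet): number of floats in a conv
   layer's weights array. Assumes channels is divisible by groups.
   Note that the batch size is not an argument at all. */
static size_t conv_weight_count(int filters, int channels, int size, int groups)
{
    return (size_t)filters * (size_t)(channels / groups)
         * (size_t)size * (size_t)size;
}
```

Calling conv_weight_count(16, 3, 3, 1) gives 432, which is the 16x3x3x3 from the example above, whether the batch is 1 or 64.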

a -> weights array
b -> im2col output array 
c -> output features array

If group convolution is used, j*l.nweights/l.groups acts as an offset index to select the respective group's filters.
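As a rough sketch of the indexing in the quoted loop, the three offsets can be written as separate functions. The conv_dims struct and the function names are hypothetical and only mirror the layer fields used above. The sketch assumes l.c and l.nweights are divisible by l.groups. Only the weight offset ignores i. The workspace b needs no offset because im2col_cpu refills it on every iteration from an input pointer that does depend on i.

```c
#include <stddef.h>

/* Hypothetical struct holding only the layer fields used in the quoted loop. */
typedef struct {
    int n, c, h, w, size, groups, out_w, out_h, nweights;
} conv_dims;

/* Offset of pointer a into l.weights: depends on the group j only,
   so every image in the batch reuses the same filters. */
static size_t weight_offset(const conv_dims *l, int j)
{
    return (size_t)j * (size_t)(l->nweights / l->groups);
}

/* Offset into net.input passed to im2col_cpu: depends on image i and group j. */
static size_t input_offset(const conv_dims *l, int i, int j)
{
    return (size_t)(i * l->groups + j) * (size_t)(l->c / l->groups)
         * (size_t)l->h * (size_t)l->w;
}

/* Offset of pointer c into l.output: depends on image i and group j. */
static size_t output_offset(const conv_dims *l, int i, int j)
{
    size_t m = (size_t)(l->n / l->groups);   /* filters per group    */
    size_t n = (size_t)l->out_w * l->out_h;  /* output pixels per map */
    return (size_t)(i * l->groups + j) * n * m;
}
```

With groups set to 1, input_offset(l, i, 0) is simply i*c*h*w and output_offset(l, i, 0) is i*n*out_w*out_h, so each image in the batch is read and written at its own position while weight_offset stays at 0.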

HyunjunShin commented 6 years ago

I know what "group" means. Hmm... anyway, there is a problem with the batch size.

DRACOyu commented 6 years ago

Hello, I am trying to use group convolution. Training runs without problems, but the output is wrong. This is the actual output bounding box: 0.000000-1058642304.000000-1.000000-0.000000. I would like to use the author's group convolution. What do I need to modify? So far I have only modified two low-level functions, parse_convolutional and parse_softmax, to handle the groups parameter.

DRACOyu commented 6 years ago

How do I use groups in darknet? Must the cuDNN version be greater than 7?