Hi all, we are trying to implement in Caffe the Cross-Input Neighbourhood Differences layer described in http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Ahmed_An_Improved_Deep_2015_CVPR_paper.pdf.
Two input blobs (feature maps computed from two different images by convolution and pooling) are used to compute this feature. For each pixel of each channel of f, the pixel value is repeated to fill a 5x5 patch, and the 5x5 neighbourhood of g centred on the same location in the corresponding channel is subtracted from it. This outputs a 5x5 patch for each input pixel. Thereafter the roles of f and g are reversed, the operation is repeated, and the two results are concatenated along the channel dimension. Hence, for two WxHxC inputs the output is of size 5Wx5Hx2C, where W, H and C are the width, height and number of channels respectively.
For simplicity, the following is the corresponding MATLAB code for the operation (inputs: f and g, each of size MxMxC; for simplicity C=1):

% f and g are square (MxM), so a scalar side length is enough
image_size=size(f,1);
kernel_size=5;
pad_size=floor((kernel_size-1)/2);
G=padarray(g,[pad_size,pad_size],'replicate');
F=padarray(f,[pad_size,pad_size],'replicate');
%%%%%%%%%%%%%%%%%%%%%%%%%%
%im2col to get neighbouring kernels into a matrix
%%%%%%%%%%%%%%%%%%%%%%%%%%
fcol=im2col(F,[kernel_size,kernel_size],'sliding');
gcol=im2col(G,[kernel_size,kernel_size],'sliding');
%%%%%%%%%%%%%%%%%%%%%%%%%%
%Each pixel is repeated to the size of the kernel
%%%%%%%%%%%%%%%%%%%%%%%%%%
fres=repmat(reshape(f,[1,image_size*image_size]),[kernel_size*kernel_size,1]);
gres=repmat(reshape(g,[1,image_size*image_size]),[kernel_size*kernel_size,1]);
%%%%%%%%%%%%%%%%%%%%
%Subtract g neighbourhoods from f
%%%%%%%%%%%%%%%%%%%%
c1=fres-gcol;
Res2(:,:,1)=col2im(c1,[kernel_size,kernel_size],[image_size*kernel_size,image_size*kernel_size],'distinct');
%%%%%%%%%%%%%%%%%%%%
%Subtract f neighbourhoods from g
%%%%%%%%%%%%%%%%%%%%
c2=gres-fcol;
Res2(:,:,2)=col2im(c2,[kernel_size,kernel_size],[image_size*kernel_size,image_size*kernel_size],'distinct');
%Res2 is the output
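The same computation can also be written with explicit loops, which may be closer to how a CPU forward pass in a Caffe layer would be structured. The following is only a minimal single-channel sketch in plain Python, not Caffe code. The function names (clamp, neighbourhood_diff, cross_input_forward) are made up for illustration, and clamping the indices is assumed to stand in for the 'replicate' padding used above.

```python
def clamp(i, n):
    """Clamp index i to [0, n-1]; plays the role of 'replicate' padding."""
    return max(0, min(i, n - 1))


def neighbourhood_diff(f, g, k=5):
    """Single-channel difference map (hypothetical helper).

    out[k*r + dr][k*c + dc] = f[r][c] - g[r + dr - p][c + dc - p],
    with p = (k - 1) // 2 and out-of-range indices clamped to the border.
    f and g are lists of rows with the same height and width.
    """
    h, w = len(f), len(f[0])
    p = (k - 1) // 2
    out = [[0.0] * (k * w) for _ in range(k * h)]
    for r in range(h):
        for c in range(w):
            for dr in range(k):
                for dc in range(k):
                    rr = clamp(r + dr - p, h)
                    cc = clamp(c + dc - p, w)
                    # f[r][c] is simply read k*k times, so no explicit
                    # repmat-style copy is needed.
                    out[k * r + dr][k * c + dc] = f[r][c] - g[rr][cc]
    return out


def cross_input_forward(f, g, k=5):
    """Return the two output channels: [f vs. N(g), g vs. N(f)]."""
    return [neighbourhood_diff(f, g, k), neighbourhood_diff(g, f, k)]
```

Written this way, the repmat disappears: the repeated pixel is just the same scalar read inside the two inner loops, so only the neighbourhood lookup needs index arithmetic.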
While this is straightforward in MATLAB, we were wondering how to implement the repmat in Caffe. Furthermore, since the layer has no parameters, it will not have any parameter diffs. For the backward pass, however, how do we propagate the gradients from the layers that follow to the layers that precede it? This layer has proven to be very useful for person re-identification, and it might be used in the future for new face verification networks as well.
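One way to reason about the backward pass: every output value has the form f[r][c] - g[rr][cc] (or the same with f and g swapped), so each incoming gradient is added to the repeated pixel and subtracted from the neighbourhood pixel it was paired with. The sketch below assumes the output layout of the MATLAB code (one k x k block per input pixel, a single channel per input) and uses index clamping in place of 'replicate' padding. It is plain Python with a made-up function name (cross_input_backward), not Caffe API. In a Caffe layer the same accumulation would presumably be written into the diffs of the two bottom blobs.

```python
def clamp(i, n):
    """Clamp index i to [0, n-1]; plays the role of 'replicate' padding."""
    return max(0, min(i, n - 1))


def cross_input_backward(top_diff, h, w, k=5):
    """Gradients w.r.t. f and g for one channel (hypothetical helper).

    top_diff is [d_k1, d_k2], each of size (k*h) x (k*w), where
      K1[k*r+dr][k*c+dc] = f[r][c] - g[rr][cc]
      K2[k*r+dr][k*c+dc] = g[r][c] - f[rr][cc]
    and (rr, cc) is the clamped neighbour of (r, c).
    Returns (df, dg), each of size h x w.
    """
    p = (k - 1) // 2
    d_k1, d_k2 = top_diff
    df = [[0.0] * w for _ in range(h)]
    dg = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            for dr in range(k):
                for dc in range(k):
                    rr = clamp(r + dr - p, h)
                    cc = clamp(c + dc - p, w)
                    g1 = d_k1[k * r + dr][k * c + dc]
                    g2 = d_k2[k * r + dr][k * c + dc]
                    # K1: +1 w.r.t. the repeated f pixel, -1 w.r.t. the g neighbour
                    df[r][c] += g1
                    dg[rr][cc] -= g1
                    # K2: +1 w.r.t. the repeated g pixel, -1 w.r.t. the f neighbour
                    dg[r][c] += g2
                    df[rr][cc] -= g2
    return df, dg
```

In other words, the gradient of the repmat is a sum over each k x k block, and the gradient of the neighbourhood extraction is a scatter-add back onto the input, with border pixels collecting the contributions of the padded positions.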