MPI-IS / bilateralNN

Learning Sparse High Dimensional Filters with Neural Networks
http://bilateralnn.is.tue.mpg.de
BSD 3-Clause "New" or "Revised" License

BNN layers used alongside downsampling layers not working properly #9

Closed codepujan closed 7 years ago

codepujan commented 7 years ago

Hello, I tried to build a simple stack of bilateral neural network (BNN) layers, using a pooling layer to separate two stacks. The architecture is very simple: permutohedral_1 -> permutohedral_2 -> pooling -> permutohedral_3

The input to the permutohedral_3 layer is the pooling layer just before it (along with the pooled bilateral features).

...permutohedral 1 ...permutohedral 2

Now, the layers where the problem seems to arise:

layer {
  name: "pool1"
  type: "Pooling"
  bottom: "permutohedral_2"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "bilateral_pool"
  type: "Pooling"
  bottom: "bilateral_features"
  top: "bilateral_pool"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "permutohedral3"
  type: "Permutohedral"
  bottom: "pooling"
  bottom: "bilateral_pool"  # comes from the pooling of the initial bilateral_features layer
  bottom: "bilateral_features"
  ....

I have listed the top shapes of the given layers:

Top : permutohedral_1 -> 1 32 375 500
Top : permutohedral_2 -> 1 32 375 500
Top : pooling -> 1 32 188 250
Top : bilateral_pool -> 1 5 188 250
Top : permutohedral_3 -> 1 64 375 500

The inputs to the permutohedral_3 layer are [1 32 188 250] and [1 5 188 250]. But the output of the permutohedral layer still has double their size, [1 128 375 500]. It seems the layer is not taking the downsampling into account and is actually processing the whole image.

Any guidance on solving this problem, so that the output of permutohedral_3 is shaped something like [1 64 188 250], would be great!

Thanks.

varunjampani commented 7 years ago

The output of the permutohedral layer has the same spatial size as the third bottom ('bottom3', the output features). If you want low-resolution output, you probably want to pass 'bilateral_pool' as the third bottom as well.

To be clearer: the first bottom (input) and the second bottom (input features) of the permutohedral layer should have the same spatial dimensions. The third bottom (output features) determines the spatial dimensions of the layer output (filtered result).
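The shape rule above can be checked with a small Python sketch. This is only a model of the rule as described in this thread, not code from bilateralNN: `permutohedral_top_shape` is a hypothetical helper, `num_output = 64` is taken from the reported top shape, and the shape of `bilateral_features` is assumed to be [1 5 375 500] (inferred from its pooled shape). The pooling helper uses the rounding-up formula that Caffe applies to Pooling layers.

```python
import math


def caffe_pool_out(size, kernel, stride, pad=0):
    """Spatial output size of a Caffe Pooling layer (Caffe rounds up)."""
    return int(math.ceil((size + 2 * pad - kernel) / stride)) + 1


def permutohedral_top_shape(bottom1, bottom2, bottom3, num_output):
    """Hypothetical helper: models the shape rule described in this thread.

    bottom1 = input, bottom2 = input features, bottom3 = output features.
    The top takes its spatial size from bottom3.
    """
    n, _, h1, w1 = bottom1
    _, _, h2, w2 = bottom2
    if (h1, w1) != (h2, w2):
        raise ValueError("bottom1 and bottom2 must share spatial dimensions")
    _, _, h3, w3 = bottom3
    return (n, num_output, h3, w3)


# Shapes from the issue: 375 x 500 input, 2 x 2 max pooling with stride 2.
h = caffe_pool_out(375, kernel=2, stride=2)
w = caffe_pool_out(500, kernel=2, stride=2)

pooled = (1, 32, h, w)                 # top of the pooling layer
bilateral_pool = (1, 5, h, w)          # pooled bilateral features
bilateral_features = (1, 5, 375, 500)  # assumed full-resolution shape

# Original setup: full-resolution features as the third bottom.
full_res = permutohedral_top_shape(pooled, bilateral_pool, bilateral_features, 64)

# Suggested fix: pass bilateral_pool as the third bottom as well.
low_res = permutohedral_top_shape(pooled, bilateral_pool, bilateral_pool, 64)

print(full_res)  # (1, 64, 375, 500)
print(low_res)   # (1, 64, 188, 250)
```

With 'bilateral_features' as the third bottom the sketch reproduces the full-resolution top reported in the listing; swapping in 'bilateral_pool' gives the low-resolution shape asked for.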