se7enXF closed this issue 4 years ago
OK! I found the answer.
You just split the feature map into tiles at different scales and apply global pooling to each tile to implement PSP Pooling. This has the same effect as applying pooling of different kernel sizes to the fixed-scale feature map.
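The equivalence between the two views can be checked with a small numpy sketch (my own illustration, not code from the repository): max pooling with kernel == stride == p over an (H, W) map gives the same result as splitting the map into p x p tiles and taking the global max of each tile.

```python
import numpy as np

def strided_max_pool(x, p):
    """Max pool a 2-D array with kernel size p and stride p."""
    H, W = x.shape
    return x.reshape(H // p, p, W // p, p).max(axis=(1, 3))

def tiled_global_max(x, p):
    """Split into p x p tiles, then apply global max pooling per tile."""
    H, W = x.shape
    out = np.empty((H // p, W // p))
    for i in range(H // p):
        for j in range(W // p):
            out[i, j] = x[i*p:(i+1)*p, j*p:(j+1)*p].max()
    return out

x = np.arange(16.0).reshape(4, 4)
print(np.array_equal(strided_max_pool(x, 2), tiled_global_max(x, 2)))  # True
```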
Hi @se7enXF, apologies for the late reply, I just saw this. You can find a simple explanation of the PSPPooling in the psp_pooling_understanding_nonHybrid.py file.
In mxnet, the dimensions are (Batch, NChannels, Height, Width). The input is subsampled (in the H, W space) with max pooling at scales of 1/1, 1/2, 1/4, etc. per dimension (this pooling leaves the number of channels unchanged):
x = F.Pooling(_input, kernel=[pool_size, pool_size], stride=[pool_size, pool_size], pool_type='max')
Then the result is upsampled back to the original resolution:
x = F.UpSampling(x, sample_type='nearest', scale=pool_size)
This output is then passed through a convolution layer that reduces the number of channels to 1/4 of their initial number.
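The pooling/upsampling pair in this step can be illustrated with a small numpy sketch (my own, not from the repository): nearest-neighbour upsampling by an integer scale simply repeats each value `scale` times along both spatial axes.

```python
import numpy as np

def nearest_upsample(x, scale):
    """Nearest-neighbour upsampling of a (H, W) array by an integer scale,
    mimicking sample_type='nearest' with the given scale."""
    return np.repeat(np.repeat(x, scale, axis=0), scale, axis=1)

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(nearest_upsample(x, 2))
# [[1. 1. 2. 2.]
#  [1. 1. 2. 2.]
#  [3. 3. 4. 4.]
#  [3. 3. 4. 4.]]
```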
Then these 4 outputs are concatenated with the initial input, resulting in twice as many channels as the initial number of filters:
out = F.concat(p[0], p[1], p[2], p[3], p[4], dim=1)
Finally, a last convolution brings the total number of channels back to the initial number.
The actual implementation in mxnet is a bit more involved in order to support the hybridization feature, but the philosophy is the same.
Hope this helps.
I want to implement your work in TensorFlow but have run into a problem. In your paper, 'the initial input is split in channel (feature) space in 4 equal partitions' describes the initial input for each branch of PSP Pooling. I noticed this differs from PSPNet, and although I follow the idea, I do not understand why you split the feature map. Here is the problem.
Let a feature map have shape F(batch, width, height, channel). I think your idea is to split the feature map as F[b, w, h, c/4], but this differs from your code.
If I am right that data in mxnet has shape (batch, channel, width, height), why do you split the feature map in width and height, or am I missing something?
Sorry to trouble you, and thanks for your work!