hszhao / PSPNet

Pyramid Scene Parsing Network, CVPR2017.
https://hszhao.github.io/projects/pspnet

bin size in ppm #34

Open wtliao opened 7 years ago

wtliao commented 7 years ago

Hi,

I don't understand the bin sizes of the pyramid pooling module (1×1, 2×2, 3×3, 6×6) in the paper. For a bin size of 3×3, for instance, does it mean that the width and height of each feature map after pooling are both 3? If so, is each feature map square? Thx.

hszhao commented 7 years ago

Yes. The original design is trained with a square input (like 473×473), so the pooled maps in the ppm are all square.

  1. Let's say the crop size of your input data is c; it should be a number that fits the equation c = 8x + 1.
  2. Then your feature map size at conv5_3 is w = x + 1.
  3. In each pool level L (1, 2, 3, 6), assume the kernel size is k and the stride is s, with k >= s, say k = s + a. In level 1, w = s + a; in level 2, w = 2s + a; in level 3, w = 3s + a; in level 6, w = 6s + a. So your s and k in level L should be s = floor(w / L), k = s + (w % L). Also, you can modify the pool layer and interp layer to do this calculation automatically.
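
For anyone who wants to check the numbers, the steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from the repository; the function name `ppm_pool_params` and its return format are made up for this example.

```python
def ppm_pool_params(crop_size, levels=(1, 2, 3, 6)):
    """Return (w, {L: (kernel, stride)}) for a square crop of side crop_size.

    Hypothetical helper: it only reproduces the arithmetic described above.
    """
    # Step 1: the crop size must satisfy c = 8x + 1.
    if crop_size < 1 or (crop_size - 1) % 8 != 0:
        raise ValueError("crop size must be of the form 8x + 1")
    x = (crop_size - 1) // 8

    # Step 2: feature map size at conv5_3.
    w = x + 1

    # Step 3: stride and kernel for each pyramid level.
    params = {}
    for L in levels:
        s = w // L          # s = floor(w / L)
        if s < 1:
            raise ValueError(f"feature map of size {w} is too small for level {L}")
        k = s + w % L       # k = s + a, where a = w % L
        # Sanity check: w - k = (L - 1) * s, so the pooled map has exactly L bins per side.
        assert (w - k) // s + 1 == L
        params[L] = (k, s)
    return w, params


if __name__ == "__main__":
    w, params = ppm_pool_params(473)
    print(w)       # 60
    print(params)  # {1: (60, 60), 2: (30, 30), 3: (20, 20), 6: (10, 10)}
```

With c = 473 this gives w = 60 and kernel/stride pairs of 60, 30, 20 and 10. When w is not divisible by L (for example c = 481, w = 61), the kernel becomes one larger than the stride, so neighbouring bins overlap slightly but the output is still L×L.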