hellochick / ICNet-tensorflow

TensorFlow-based implementation of "ICNet for Real-Time Semantic Segmentation on High-Resolution Images".

image size problems #46

Open ifangcheng opened 6 years ago

ifangcheng commented 6 years ago
  1. Image size problems in training: The images in my own dataset are small (e.g. h=200, w=80). When I train the model, how should I set INPUT_SIZE? Should it be INPUT_SIZE='720,720', INPUT_SIZE='480,480', or INPUT_SIZE='200,80'?

  2. Image size problems in inference:
    When I run inference.py with a smaller input image (e.g. h=212, w=87), the image is padded to 224,96 and then an error comes up, something like "stride must be >0 got 0 for conv5_3_pool6 ...". However, if I run inference on a bigger image (e.g. 360,480), everything works well. So, is there a limit on the input image size? Are arbitrary image sizes not supported?

hellochick commented 6 years ago

Hey @ifangcheng, I think the problem occurs in model.py, lines 468-482:

        (self.feed('conv5_3/relu')
             .avg_pool(h, w, h, w, name='conv5_3_pool1')
             .resize_bilinear(shape, name='conv5_3_pool1_interp'))

        (self.feed('conv5_3/relu')
             .avg_pool(h/2, w/2, h/2, w/2, name='conv5_3_pool2')
             .resize_bilinear(shape, name='conv5_3_pool2_interp'))

        (self.feed('conv5_3/relu')
             .avg_pool(h/3, w/3, h/3, w/3, name='conv5_3_pool3')
             .resize_bilinear(shape, name='conv5_3_pool3_interp'))

        (self.feed('conv5_3/relu')
             .avg_pool(h/4, w/4, h/4, w/4, name='conv5_3_pool6')
             .resize_bilinear(shape, name='conv5_3_pool6_interp'))

So, the default minimum size for input images is: output stride 32 * pooling stride 4 = 128. But you can change these values to support smaller images. For example, change .avg_pool(h/4, w/4, h/4, w/4, name='conv5_3_pool6') into .avg_pool(3, 3, 3, 3, name='conv5_3_pool6')
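
To make the arithmetic concrete, here is a minimal standalone sketch (not code from this repo) of how the pooling sizes come out for a given input. It assumes that conv5_3/relu is exactly 1/32 of the padded input in each dimension and that h/4 and w/4 are integer (floor) divisions. The names pyramid_pool_sizes and is_supported are hypothetical helpers made up for illustration.

```python
# Sketch only: assumes the conv5_3 feature map is input_size // 32 and that
# the pooling kernel/stride sizes are computed with floor division.
OUTPUT_STRIDE = 32             # output stride quoted in the reply above
POOL_DIVISORS = (1, 2, 3, 4)   # divisors used by conv5_3_pool1/2/3/6


def pyramid_pool_sizes(input_h, input_w, output_stride=OUTPUT_STRIDE):
    """Return {divisor: (kernel_h, kernel_w)} for each pyramid pooling branch."""
    h = input_h // output_stride
    w = input_w // output_stride
    return {d: (h // d, w // d) for d in POOL_DIVISORS}


def is_supported(input_h, input_w):
    """True if every pooling branch gets a kernel/stride of at least 1."""
    sizes = pyramid_pool_sizes(input_h, input_w)
    return all(kh > 0 and kw > 0 for kh, kw in sizes.values())


if __name__ == "__main__":
    for size in [(224, 96), (128, 128), (360, 480)]:
        print(size, pyramid_pool_sizes(*size), is_supported(*size))
```

Under these assumptions, a 224,96 input gives a 7x3 feature map, so w/4 becomes 0 for conv5_3_pool6, which matches the "stride must be >0 got 0" error. A 360,480 input gives 11x15 and every branch stays positive.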