Tramac / Fast-SCNN-pytorch

A PyTorch Implementation of Fast-SCNN: Fast Semantic Segmentation Network
Apache License 2.0

About the input size of image in demo.py #9

Closed feitiandemiaomi closed 5 years ago

feitiandemiaomi commented 5 years ago

Hi, thanks for your great repo. I succeeded in running demo.py following the README, but when I tried other images it raised the following error:

    File "demo.py", line 160, in <module>
        demo()
    File "demo.py", line 52, in demo
        outputs = model(image)
    File "/home/data/anaconda3/envs/caffe2_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
        result = self.forward(*input, **kwargs)
    File "/home/data/code/Fast-SCNN-pytorch/models/fast_scnn.py", line 37, in forward
        x = self.feature_fusion(higher_res_features, x)
    File "/home/data/anaconda3/envs/caffe2_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
        result = self.forward(*input, **kwargs)
    File "/home/data/code/Fast-SCNN-pytorch/models/fast_scnn.py", line 215, in forward
        out = higher_res_feature + lower_res_feature
    RuntimeError: The size of tensor a (135) must match the size of tensor b (136) at non-singleton dimension 2

The image size is 1080×1920. Should I resize it to a fixed size?

Tramac commented 5 years ago

Probably because the shapes of higher_res_feature and lower_res_feature are different.

feitiandemiaomi commented 5 years ago

Yes. I printed the sizes of the two tensors: one is [1, 64, 135, 240] and the other is [1, 128, 136, 240]. It seems one of them changed size during downsampling.
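For readers hitting the same error: the off-by-one can be traced with a little arithmetic. A sketch, assuming the standard Fast-SCNN layout (high-res branch downsampled by 8, low-res branch by a further 4 with stride-2 convolutions that round odd sizes up, then a fixed ×4 upsample in feature fusion):

    import math

    h = 1080                            # input image height
    high_res = h // 8                   # learning-to-downsample branch: 135
    low_res = math.ceil(high_res / 4)   # two more stride-2 convs round 135 up to 34
    upsampled = low_res * 4             # fixed x4 upsample in feature fusion: 136
    print(high_res, upsampled)          # 135 136 -> mismatch at dimension 2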

Tramac commented 5 years ago

Try to change the upsampling method.
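One way to do that (a sketch, assuming the fusion module currently upsamples with a fixed scale_factor=4, and omitting the convolutions the real FeatureFusionModule applies) is to interpolate to the exact size of the high-res feature:

    import torch
    import torch.nn.functional as F

    def fuse(higher_res_feature, lower_res_feature):
        # Upsample to the exact (H, W) of the high-res branch instead of by a
        # fixed x4 factor, so odd input sizes can no longer cause a mismatch.
        lower_res_feature = F.interpolate(
            lower_res_feature, size=higher_res_feature.shape[2:],
            mode='bilinear', align_corners=True)
        return higher_res_feature + lower_res_feature

    out = fuse(torch.randn(1, 64, 135, 240), torch.randn(1, 64, 34, 60))
    print(out.shape)  # torch.Size([1, 64, 135, 240])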

feitiandemiaomi commented 5 years ago

Sorry, my mistake. I found that the image size must have a 2:1 (width:height) ratio; then there is no need to change the upsampling.

priancho commented 5 years ago

Hi @feitiandemiaomi ,

I am using the COCO dataset and faced the same issue. My solution is to zero-pad W and H so that the padded W' and H' are divisible by 32: Fast-SCNN halves the input size W×H five times (2^5 = 32), so dimensions divisible by 32 cannot cause a dimension mismatch after upsampling. For example, H = 1080 is padded to 1088, while W = 1920 is already a multiple of 32.

The following is my modification to eval.py :-)

    # assumes eval.py already imports: os, torch, and torch.nn.functional as F
    def eval(self):
        self.model.eval()
        for i, (image, label) in enumerate(self.val_loader):
            # Pad the input image so W and H are divisible by 32 (see paper).
            # image is in NCHW order; F.pad takes pad widths in pairs starting
            # from the last dim: (W_left, W_right, H_top, H_bottom, C..., N...)
            image_shape = image.shape
            image_paddings = [0, 0, 0, 0, 0, 0, 0, 0]
            ## padding H
            if image_shape[2] % 32 != 0:
                image_paddings[3] = 32 - (image_shape[2] % 32)
            ## padding W
            if image_shape[3] % 32 != 0:
                image_paddings[1] = 32 - (image_shape[3] % 32)
            image = F.pad(image, image_paddings)

            # predict seg
            image = image.to(self.args.device)
            outputs = self.model(image)

            # remove padding area
            pred = torch.argmax(outputs[0], 1)
            pred = pred.cpu().data.numpy()
            pred = pred[:, :image_shape[2], :image_shape[3]]
            label = label.numpy()

            # update evaluation metric
            self.metric.update(pred, label)
            pixAcc, mIoU = self.metric.get()
            print('Sample %d, validation pixAcc: %.3f%%, mIoU: %.3f%%' % (i + 1, pixAcc * 100, mIoU * 100))

            # save the result
            predict = pred.squeeze(0)
            mask = get_color_pallete(predict, self.args.dataset)

            # note: the val_loader must be created with shuffle=False so that
            # the saved masks line up with sample indices
            mask.save(os.path.join(self.outdir, 'seg_{}.png'.format(i)))
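For demo.py itself (where this issue started), the same padding trick can be wrapped in a small helper; the sketch below is illustrative, and pad_to_multiple is not a function from this repo:

    import torch
    import torch.nn.functional as F

    def pad_to_multiple(image, multiple=32):
        """Zero-pad an NCHW tensor on the right/bottom so H and W are divisible by multiple."""
        _, _, h, w = image.shape
        pad_h = (multiple - h % multiple) % multiple
        pad_w = (multiple - w % multiple) % multiple
        # F.pad pads from the last dimension: (W_left, W_right, H_top, H_bottom)
        return F.pad(image, (0, pad_w, 0, pad_h)), (h, w)

    # usage in demo.py, before running the model:
    # image, (h, w) = pad_to_multiple(image)
    # outputs = model(image)
    # pred = torch.argmax(outputs[0], 1)[:, :h, :w]  # crop the padding away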

feitiandemiaomi commented 5 years ago

@priancho Very smart, let me try it.