sg-nm / Operation-wise-attention-network

Attention-based Adaptive Selection of Operations for Image Restoration in the Presence of Unknown Combined Distortions (CVPR 2019)
MIT License

How does your model process test images with arbitrary size? #1

Closed pengsongyou closed 5 years ago

pengsongyou commented 5 years ago

Hi,

First of all, thanks so much for your amazing work! I am trying your code and have a question: in the current test code, you still process images of size 63x63, the same as the training images. However, in your object detection examples, the images are clearly at their original sizes, which differ from image to image. How do you handle this case? Do you feed such an image in directly and take its output, or do you resize it first and then resize the output back?

Thanks for your help in advance!

sg-nm commented 5 years ago

Thank you so much for testing our code. I'm sorry I did not make this clear in our paper. For object detection, we used the images at their original size and fed them directly into the CNN. Also, the CNN for image restoration was trained only on the DIV2K dataset described in Section 4.2.1 of our paper (i.e. with an image size of 63x63); we did not train the CNN on the VOC dataset.
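This works because a restoration network built from convolutions with "same" padding preserves spatial size, so variable-size test images can be passed straight through without resizing. A minimal NumPy illustration of that property (the `conv2d_same` helper below is a hypothetical stand-in for the real network, not repo code):

```python
import numpy as np

def conv2d_same(img, kernel):
    """Naive 'same' convolution: zero-pad by half the kernel, slide it over the image."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

kernel = np.ones((3, 3)) / 9.0  # box blur standing in for learned weights

# A 63x63 training-size patch and an arbitrary-size "test" image:
for shape in [(63, 63), (120, 200)]:
    img = np.random.rand(*shape)
    out = conv2d_same(img, kernel)
    print(shape, "->", out.shape)  # spatial size is preserved
```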

pengsongyou commented 5 years ago

Thanks so much for your reply! May I ask one more quick question?

While training your network, the test PSNR/SSIM reaches 70.70/0.9562 after only 5 epochs on the moderate set. That seems far too high. Is something wrong?

sg-nm commented 5 years ago

Did you check the input images that are fed to the CNN? If all pixels of an input image are black, you need to modify the function in the data_load_mix.py file as follows:

def __getitem__(self, index):
        img = self.data[index]
        img_gt = self.label[index]
        # transforms (numpy -> Tensor)
        if self.transform is not None:
            img = self.transform(img*255) #change
        if self.target_transform is not None:
            img_gt = self.target_transform(img_gt*255) #change
        return img, img_gt

pengsongyou commented 5 years ago

Yes, I have checked the input/target/output images and they all look fine, with no black images (their values are roughly in the range 0~1). I also tried your suggestion of multiplying by 255 and modified the SSIM/PSNR calculation code (here) accordingly, including changing the range to 255 and the data_range used for the SSIM and PSNR calculations. However, the SSIM/PSNR are still around 0.98/95 after the first epoch...
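A quick way to catch this kind of range mismatch early is to print the value range of a batch just before it enters the network. The `check_range` helper below is hypothetical, not part of the repo; it simply flags all-black inputs and reports min/max:

```python
import numpy as np

def check_range(batch, name="batch"):
    """Report min/max of an array and warn if it looks all-black (everything ~0)."""
    lo, hi = float(np.min(batch)), float(np.max(batch))
    if hi < 1e-6:
        print(f"WARNING: {name} looks all-black (max={hi})")
    print(f"{name}: min={lo:.4f}, max={hi:.4f}")
    return lo, hi

# Values in [0, 1] suggest the inputs were never scaled to [0, 255]:
img = np.random.rand(3, 63, 63)       # range ~[0, 1]
check_range(img, "input")
check_range(img * 255, "input*255")   # range ~[0, 255]
```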

sg-nm commented 5 years ago

Thank you for your reply and for testing. Which dataset are you using now? I will check again.

pengsongyou commented 5 years ago

I am using the RL-Restore dataset (mix) for training and testing. Basically, I followed exactly what you suggested in the readme to acquire the RL-Restore training/testing data, and I train with: python main.py -m mix -g 8

sg-nm commented 5 years ago

I tested the code; the PSNR/SSIM are 24.92/0.6224 after the first epoch, and the output images look fine. I have attached my requirements.txt file, could you please check it? Also, when calculating PSNR/SSIM values during the test phase, the test batch size needs to be set to 1.

pengsongyou commented 5 years ago

Hi, I figured out the problem. Since the output range of your network is [0, 255] rather than [0, 1], the testing code here should be modified accordingly:

    output = output.data.cpu().numpy()[0]
    output[output > 255] = 255.0  # clamp predictions into [0, 255]
    output[output < 0] = 0.0
    output = output.transpose((1, 2, 0))  # CHW -> HWC
    hr_patch = hr_patch.data.cpu().numpy()[0]
    hr_patch[hr_patch > 255] = 255.0
    hr_patch[hr_patch < 0] = 0.0
    hr_patch = hr_patch.transpose((1, 2, 0))
    # SSIM
    test_ssim += ski_ssim(output, hr_patch, data_range=255, multichannel=True)
    # PSNR
    imdf = (output - hr_patch) ** 2
    mse = np.mean(imdf) + eps
    test_psnr += 10 * math.log10(255.0**2 / mse)
    test_ite += 1

With this modification and your provided model_best.pth, the PSNR / SSIM is 27.10 / 0.68, which is quite consistent with what you reported in the paper!
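For reference, the manual clamping and PSNR steps above can be written compactly with np.clip. This is a self-contained sketch under the same [0, 255] assumption (ski_ssim is omitted since it comes from scikit-image in the repo; `psnr_255` is a hypothetical helper name):

```python
import math
import numpy as np

def psnr_255(output, target, eps=1e-10):
    """PSNR for images in [0, 255], clipping both arrays into range first."""
    output = np.clip(output, 0.0, 255.0)
    target = np.clip(target, 0.0, 255.0)
    mse = np.mean((output - target) ** 2) + eps
    return 10 * math.log10(255.0**2 / mse)

# A prediction off by exactly 1 everywhere gives MSE = 1:
pred = np.full((63, 63, 3), 254.0)
gt = np.full((63, 63, 3), 255.0)
print(round(psnr_255(pred, gt), 2))  # ~48.13
```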