nmhkahn / CARN-pytorch

Fast, Accurate, and Lightweight Super-Resolution with Cascading Residual Network (ECCV 2018)

Trying to reproduce your results #6

Closed by uyekt 5 years ago

uyekt commented 5 years ago

Hi there,

Thank you very much for releasing this code!

I'm trying to reproduce your results. However, I guess I'm missing something...

Training config is:
{
    "batch_size": 64,
    "ckpt_dir": "checkpoint/carn_baseline",
    "ckpt_name": "carn_baseline",
    "clip": 10.0,
    "decay": 400000,
    "group": 1,
    "loss_fn": "L1",
    "lr": 0.0001,
    "max_steps": 600000,
    "model": "carn",
    "num_gpu": 1,
    "patch_size": 64,
    "print_interval": 1000,
    "sample_dir": "sample/",
    "scale": 0,
    "shave": 20,
    "train_data_path": "dataset/DIV2K_train.h5",
    "verbose": true
}
# of params: 1591963

Training was on DIV2K bicubic. Did you use bicubic downscaling or the unknown downgrading operators?

After 575k iterations on a single Titan X, I could only achieve the following results on Urban100:

Is it just bad luck with the initialization or do I miss something important?

Btw, I noticed that I can fit batch size 64 / patch size 64 on a single Titan X. When I use two GPUs, the second one only allocates about 600 MB of memory. Is that normal behavior?

Thanks a lot for your help!

nmhkahn commented 5 years ago

Hi. Did you run the benchmark in Python? Normally it has to be calculated in MATLAB. First, I believe evaluation in this code is performed on RGB channels, but it's common in the SR community to calculate PSNR using only the Y (luminance) channel. Second, I'm not sure why, but converting RGB to the Y channel gives different results in Python and MATLAB. So if your scores were calculated during training, please treat them as just "validation" scores.
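
For readers trying to match the numbers: a minimal sketch of the Y-channel PSNR described above, assuming the BT.601 coefficients that MATLAB's rgb2ycbcr uses (an illustration, not the repo's evaluation code):

# Minimal sketch of Y-channel PSNR (not the repo's code). Uses the same
# BT.601 coefficients as MATLAB's rgb2ycbcr, which maps Y into [16, 235].
import numpy as np

def rgb_to_y_matlab(img):
    """uint8 HxWx3 RGB array -> float64 Y channel, MATLAB convention."""
    img = img.astype(np.float64) / 255.0
    return 16.0 + 65.481 * img[..., 0] + 128.553 * img[..., 1] + 24.966 * img[..., 2]

def psnr_y(sr, hr, data_range=255.0):
    """PSNR between two uint8 RGB images, computed on the Y channel only."""
    mse = np.mean((rgb_to_y_matlab(sr) - rgb_to_y_matlab(hr)) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)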

uyekt commented 5 years ago

Oh thanks, I missed the comment in the training code. I was able to almost reproduce your results by converting RGB to Y. Do you crop any border pixels on Y before calculating PSNR/SSIM?

nmhkahn commented 5 years ago

I referred to the evaluation code in the SelfExSR repo, which crops scale_factor border pixels.
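
In other words, the metric is computed after a symmetric crop of scale_factor pixels on each side; a one-line sketch (names are illustrative):

# Shave `scale` pixels from every border of the Y channel before PSNR/SSIM,
# matching the SelfExSR-style evaluation mentioned above.
def shave(y, scale):
    return y[scale:-scale, scale:-scale]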

uyekt commented 5 years ago

Perfect! Thanks a lot!

Auth0rM0rgan commented 5 years ago

Hey @nmhkahn, thanks for sharing the code. I'm trying to reproduce your results, so I'd like to know: how many steps did you train the network for to achieve the reported results, and after how many steps did the learning rate start to decay?

Thanks in advance!

nmhkahn commented 5 years ago

@Auth0rM0rgan We trained for 600k steps and decayed the learning rate every 400k steps, just like the example training command in the README.
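
Spelled out, the schedule implied by the config earlier in this thread (lr = 1e-4, decay = 400000) is a step decay; the halving factor below follows the CARN paper's description, so verify it against train.py:

# Step-decay schedule implied by the config: scale the base learning rate
# down every `decay` steps. The 0.5 factor is from the paper's description;
# check carn/train.py for the exact rule used in this repo.
def learning_rate(step, base_lr=1e-4, decay=400_000):
    return base_lr * (0.5 ** (step // decay))

# learning_rate(0)        -> 1e-4
# learning_rate(400_000)  -> 5e-5
# learning_rate(599_999)  -> 5e-5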

Auth0rM0rgan commented 5 years ago

Hey @nmhkahn, thanks for the reply. Just one question about the CARN checkpoint you provided: is it the last checkpoint of your model after 600K steps, or the best one?

Thanks!

nmhkahn commented 5 years ago

We trained a few models for the entire 600K steps and picked the best one, but you can pick the best intermediate checkpoint if you train a single model.

zymize commented 4 years ago

@nmhkahn Hi, I just got started with super-resolution, and I don't know how to train or test with RGB converted to Y. Do I need to modify this code to use Y channels for training and testing? Or should I use the RGB-channel checkpoint and convert to Y channels for testing in MATLAB? I have never tested in MATLAB, always in TensorFlow or PyTorch; how do I do it? Looking forward to your reply. Thanks.

nmhkahn commented 4 years ago

@zymize Hi. First of all, my code works on RGB channels and the network produces RGB images. If you want to benchmark on [Set5, Set14, B100, Urban100]: 1) generate the RGB images using CARN, 2) convert them to Y-channel (gray-scale) images, 3) test with MATLAB. Many MATLAB versions of the evaluation code perform steps 2 and 3 simultaneously, such as this one.
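
A rough sketch of steps 1) and 2) above, assuming the SR outputs were already saved as PNGs (the paths are hypothetical, and the Y conversion follows the MATLAB rgb2ycbcr convention from the earlier sketch):

# Hypothetical step 2: convert CARN's saved RGB outputs to Y-channel
# (gray-scale) PNGs that a MATLAB evaluation script can consume.
import glob
import numpy as np
from PIL import Image

for path in glob.glob("results/Urban100/x4/*.png"):  # illustrative path
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=np.float64)
    # MATLAB rgb2ycbcr convention: Y in [16, 235]
    y = 16.0 + (65.481 * rgb[..., 0] + 128.553 * rgb[..., 1]
                + 24.966 * rgb[..., 2]) / 255.0
    Image.fromarray(np.round(y).astype(np.uint8)).save(
        path.replace(".png", "_y.png"))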

0asa commented 1 year ago

I'm also trying to reproduce the results. I noticed a difference between the MATLAB and Python PSNR implementations.

Using the MATLAB code:

pkg load image
im1 = imread('/datasets/Set5/image_SRF_4/img_001_SRF_4_bicubic.png');
im2 = imread('/datasets/Set5/image_SRF_4/img_001_SRF_4_HR.png');
d = compute_difference(im1, im2, 4)

returns d = 31.771

Using Python:

from piq import psnr
from torchvision.io import read_image
from torchvision.transforms import CenterCrop
from torchmetrics import PeakSignalNoiseRatio
import torch
from torch.nn.functional import mse_loss as mse
from color import rgb_to_ycbcr  # local conversion helper

im1 = read_image('/datasets/Set5/image_SRF_4/img_001_SRF_4_bicubic.png')
im2 = read_image('/datasets/Set5/image_SRF_4/img_001_SRF_4_HR.png')

# shave scale_factor (4) pixels from each border, as in the MATLAB code
border = 4 * 2
border_removal = CenterCrop((int(im1.shape[1] - border), int(im1.shape[2] - border)))

im1 = border_removal(im1)
im2 = border_removal(im2)

# using the piq module
p = psnr(im1.float().unsqueeze(0), im2.float().unsqueeze(0), data_range=255.,
         convert_to_greyscale=True, reduction='mean')
print(p)

# using the torchmetrics module, on the Y channel only
psnr2 = PeakSignalNoiseRatio()
p = psnr2(rgb_to_ycbcr(im1.float())[0, :, :], rgb_to_ycbcr(im2.float())[0, :, :])
print(p)

# "manually" computing PSNR on the Y channel
max_val = 255.0
print(10.0 * torch.log10(max_val ** 2 / mse(rgb_to_ycbcr(im1.float())[0, :, :],
                                            rgb_to_ycbcr(im2.float())[0, :, :],
                                            reduction='mean')))

gives:

tensor(30.5169)
tensor(30.5169)
tensor(30.5169)

Not sure where the difference comes from... maybe the RGB to YCbCr conversion is different?
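
One plausible explanation (an assumption; I haven't checked compute_difference line by line): MATLAB's rgb2ycbcr produces a limited-range Y (Y = 16 + (219/255) * Y_full), while many Python converters produce full-range Y (Y = 0.299 R + 0.587 G + 0.114 B on [0, 255] inputs). Scaling Y by a constant k scales the MSE by k^2, which shifts the PSNR by a constant offset:

# Hypothetical reconciliation of the MATLAB/Python gap: if one side uses
# limited-range Y (scaled by 219/255) and the other full-range Y, the PSNR
# difference is a constant, independent of the image content.
import math
shift = -20.0 * math.log10(219.0 / 255.0)
print(shift)  # ~1.32 dB, close to the ~1.25 dB gap above
# (the remainder is plausibly uint8 rounding inside the MATLAB pipeline)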