csmliu / AdaNEC


About the PSNR of wild(55) dataset during testing #1

Open Rachel-kk opened 2 years ago

Rachel-kk commented 2 years ago

Hi, I tested your released model on the wild(55) dataset, but I got a PSNR of 25.09 and an SSIM of 0.889. I don't understand why the PSNR I measured differs from your paper (PSNR of 25.26 and SSIM of 0.890).

Did you apply any other processing to the dataset?

Rachel-kk commented 2 years ago

Also, it is mentioned in your paper: "Instead, suppose that there are N training datasets, where the i-th dataset is denoted by Si, we train an SIRR model for each of the dataset."

When ERRNet is used as the backbone network, how do you train on the real90 and unaligned datasets?

CastellanLiu commented 2 years ago

Thanks for your interest in our work.

1) We use domain-level expert weights (i.e., we average the $w_i$ over all samples in a test set), so you should change the --avg parameter when testing on other datasets (see options/base_options.py). You can print the weights (the "attn_" here) and compute the average yourself; it should match the pre-defined values in our commented options. Sorry for the inconvenience.
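For readers trying to reproduce this, here is a minimal sketch (not the repo's exact code) of what "averaging the per-sample weights" means. It assumes you collect the printed "attn_" vector for every test image and then take the mean over the whole test set, which is the value that would be passed via --avg.

    import numpy as np

    per_sample_weights = []  # one entry per test image

    def collect(attn):
        # attn: the per-sample expert weights w_i printed during testing ("attn_"),
        # e.g. a length-3 vector when there are three domain experts
        per_sample_weights.append(np.asarray(attn, dtype=np.float64))

    # ... run the test loop and call collect(attn_) for every image ...

    def domain_level_weights():
        # Average over all samples of the test set; this vector is the
        # domain-level expert weight to use via --avg for that dataset.
        return np.mean(np.stack(per_sample_weights), axis=0)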

2) ERRNet is a backbone architecture; we train three models (with the ERRNet architecture) by i) initializing each model with the ERRNet official weights, and ii) fine-tuning them on these datasets (e.g., real90, unaligned).

Besides, there is a mistake in our code, namely the number of images in Wild (see here): it should be 55, not 50. However, this only affects the final average metrics computed by the code. Fortunately, the values in the paper are correct, since we only recorded the metrics for each dataset and computed the final average in Excel.
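As an aside, a hypothetical illustration (not the repo's evaluation code) of why the miscount only shifts the overall number: if the final average is weighted by the number of images per dataset, counting Wild as 50 instead of 55 changes the weighted mean but leaves every per-dataset PSNR/SSIM untouched. The PSNR values below are the paper numbers quoted later in this thread.

    # Per-dataset PSNR from the paper (see the table further down in this thread)
    psnr = {'Real20': 22.80, 'Wild': 25.26, 'Postcard': 23.08, 'Solid': 25.26}

    def weighted_avg(counts):
        # Image-count-weighted mean over the four test sets
        total = sum(counts.values())
        return sum(psnr[name] * n for name, n in counts.items()) / total

    correct = weighted_avg({'Real20': 20, 'Wild': 55, 'Postcard': 199, 'Solid': 200})
    wrong   = weighted_avg({'Real20': 20, 'Wild': 50, 'Postcard': 199, 'Solid': 200})
    print(correct, wrong)  # only the overall average differs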

Rachel-kk commented 2 years ago

Hi, thanks for your reply.

  1. Following your method, I changed the --avg parameter when testing on my local datasets. However, the PSNR I obtained on the Wild and Solid datasets still differs from your paper. The results are as follows:

                 Real20 (20)     Wild (55)       Postcard (199)   Solid (200)
     your paper  22.80 / 0.790   25.26 / 0.890   23.08 / 0.874    25.26 / 0.904
     our test    22.80 / 0.790   25.16 / 0.889   23.08 / 0.874    25.17 / 0.899

Can you provide your test dataset?

CastellanLiu commented 2 years ago

Hi, fortunately we can now access the SIR dataset without requesting it in advance, so I have uploaded our test sets to Baidu Netdisk (https://pan.baidu.com/s/10BOHUyZGYt0c-9YqP7pVDw?pwd=kka2). Note that we have converted the .jpg files to .png files. Please try again and check whether the discrepancy comes from differences between the datasets.
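(For completeness, the conversion itself is trivial; a sketch along these lines, e.g. with Pillow, reproduces it, although the exact script used to prepare the released test sets is not shown in this thread and the folder name below is a placeholder.)

    from pathlib import Path
    from PIL import Image

    for jpg in Path('wild55').glob('*.jpg'):  # 'wild55' is a placeholder folder name
        Image.open(jpg).convert('RGB').save(jpg.with_suffix('.png'))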

Rachel-kk commented 2 years ago

Hi, thank you so much!

I retested on your test datasets and got the same performance as in your paper. However, I'm still a bit curious: why did you convert the .jpg files to .png files?

CastellanLiu commented 2 years ago

Actually, the evaluation dataset I provided was prepared by my colleague (please refer to https://github.com/liyucs/RAGNet). I just used these datasets directly.

Rachel-kk commented 2 years ago

ok, thanks

I am going to reproduce your paper...

Rachel-kk commented 2 years ago

Hi,

When training an SIRR model (with the ERRNet architecture) for each of the datasets (e.g., Real90, Unaligned, SynCEIL), I ran this command:

python train_errnet.py \
  --name errnet_real90  \
  --hyper \
  -r \
  --icnn_path ./checkpoints/errnet_060_00463920.pt \
               ./checkpoints/errnet_060_00463920.pt \
               ./checkpoints/errnet_060_00463920.pt \
  --nModel 3

After training, I ran this command to test.

python test_errnet.py \
  --name errnet_real90  \
  --hyper \
  -r \
  --icnn_path ./checkpoints/errnet_real90/errnet_latest.pt \
  --nModel 3

Is the above correct?

Also, when training the expert model for SynCEIL, did you crop a 224 x 224 center region from the VOC2012 images?

CastellanLiu commented 2 years ago

Sorry for the late response.

Actually, you should train the domain experts via the official code of ERRNet, i.e., train each of the domain experts separately. They are then used as domain experts and trained with my code, which is modified from ERRNet.
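A rough sketch of that two-stage setup, under the assumption that the expert parameters stay frozen while only the combination module is optimized (DummyExpert/DummyFusion are stand-ins, not the actual ERRNet or AdaNEC modules):

    import torch
    import torch.nn as nn

    class DummyExpert(nn.Module):
        # stand-in for one separately trained ERRNet domain expert
        def __init__(self):
            super().__init__()
            self.body = nn.Conv2d(3, 3, 3, padding=1)
        def forward(self, x):
            return self.body(x)

    class DummyFusion(nn.Module):
        # stand-in for the module that predicts the per-sample expert weights w_i
        def __init__(self, n_experts=3):
            super().__init__()
            self.gate = nn.Linear(3, n_experts)
        def forward(self, x, expert_outs):
            w = torch.softmax(self.gate(x.mean(dim=(2, 3))), dim=1)  # B x N weights
            stacked = torch.stack(expert_outs, dim=1)                # B x N x C x H x W
            return (w[:, :, None, None, None] * stacked).sum(dim=1)

    experts = [DummyExpert() for _ in range(3)]
    for net in experts:
        # net.load_state_dict(torch.load('errnet_expert_i.pt'))  # hypothetical checkpoint name
        for p in net.parameters():
            p.requires_grad = False  # the domain experts are kept fixed

    fusion = DummyFusion()
    optimizer = torch.optim.Adam(fusion.parameters(), lr=1e-4)  # only the fusion module is trained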

For cropping, I follow the official code of ERRNet (see https://github.com/csmliu/AdaNEC/blob/master/data/reflect_dataset.py#L62).
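A minimal sketch of paired on-the-fly cropping in that spirit (reflect_dataset.py is the authoritative reference; this only illustrates cropping the blended image and its ground truth at the same location):

    import random
    from PIL import Image

    def paired_random_crop(img_m: Image.Image, img_t: Image.Image, size: int = 224):
        # Crop the mixture image and the transmission ground truth with the same box
        w, h = img_m.size
        x = random.randint(0, w - size)
        y = random.randint(0, h - size)
        box = (x, y, x + size, y + size)
        return img_m.crop(box), img_t.crop(box)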

Rachel-kk commented 2 years ago

Hi, I ran three experiments by changing the training data. Unfortunately, none of them reproduced the performance reported in your paper. The results below were obtained on the test datasets you provided:

            Real (20)      Wild (55)      Postcard (199)   Solid (200)
  paper     22.80/0.790    25.26/0.890    23.08/0.874      25.26/0.904
  Exp_1     23.04/0.792    24.88/0.877    22.75/0.873      24.95/0.898
  Exp_2     23.00/0.790    25.11/0.887    22.21/0.869      24.79/0.896
  Exp_3     22.94/0.790    25.22/0.886    22.33/0.877      24.35/0.891

Exp_1:

  1. Firstly, I trained the domain experts via the official code of ERRNet for three datasets separately.

    • Synthetic dataset: before training, I center-cropped the original VOC2012 images to 224x224.
    • Real89 dataset: I used the Real89 images provided by the authors of ERRNet, whose height is resized to 512; they are then cropped to 224x224 on the fly.
    • Unaligned250 dataset: it is cropped to 224x224 on the fly.
  2. Then they are used as domain experts and trained with your code; the learned expert weights for the test sets are as follows:

    • Real20: [0.092550091, 0.063934185, 0.843515721]
    • wild: [0.083660906, 0.230148451, 0.686190650]
    • postcard: [0.001090158, 0.645903816, 0.353006029]
    • solid: [0.283300335, 0.388968509, 0.327731159]

Exp_2

  1. Firstly, I trained the domain experts via the official code of ERRNet for three datasets separately.

    • Synthetic dataset: before training, I cropped the original VOC2012 images to 224x224 via reflect_dataset.py.
    • Real89 dataset: the same as Exp_1
    • Unaligned250 dataset: the same as Exp_1
  2. Then they are used as domain experts and trained with your code; the learned expert weights for the test sets are as follows:

    • Real20: [0.148800396, 0.044181681, 0.807017929]
    • wild: [0.223979196, 0.217365322, 0.558655489]
    • postcard: [0.061034970, 0.300128878, 0.638836153]
    • solid: [0.558580633, 0.181719502, 0.259699868]

Exp_3

  1. Firstly, I trained the domain experts via the official code of ERRNet for three datasets separately.

    • Synthetic dataset: The same as Exp_2
    • Real89 dataset: The shorter side of Real89 images is resized to 512
    • Unaligned250 dataset: The same as Exp_1
  2. Then they are used as domain experts and trained with your code; the learned expert weights for the test sets are as follows:

    • Real20: [0.079918947, 0.007921040, 0.912160021]
    • wild: [0.100883266, 0.125001486, 0.774115256]
    • postcard: [0.165105508, 0.211451088, 0.623443398]
    • solid: [0.433878908, 0.161532401, 0.404588691]

By the way, I trained the models on a TITAN V with PyTorch 1.7.1 + CUDA 10.1.

CastellanLiu commented 2 years ago

I would like to check several things below:

  1. Did you initialize the model when training the domain experts?
  2. If yes, which model did you use for initialization?
  3. For ERRNet, the performance of the domain experts has been provided in Table 4. Before checking the final results, could you check your domain experts?
  4. Are the domain expert parameters fixed during training?

Actually, the training code in this repo is not well organized; I will release a better version later (but not in the next few days, since I have been busy recently).

Rachel-kk commented 2 years ago

  1. Yes, I used the same model (the ERRNet official weights) to initialize the training of all the domain experts, and each dataset was iterated over 20 times.
  2. No, I didn't check my domain experts before checking the final results. I have now tested them, and their performance metrics for my three experiments are shown in the attached image.
  3. I directly ran your training code with these domain experts, without any modifications, so I guess the domain expert parameters should be fixed.
CastellanLiu commented 2 years ago

Sorry, I used the model I fine-tuned with train_errnet_unalign.py for initialization. As mentioned in the README, the training code is not exactly the one I used; it should work fine, but I'm not completely sure about it.

Rachel-kk commented 2 years ago

Thank you for telling me this; I will retrain the initial model in that way.

Looking forward to your new code!