csjliang / DASR

Official implementation of the paper 'Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution' in ECCV 2022
Apache License 2.0

Questions about pretrained MSRResNet #8

Closed orchidmalevolence closed 2 years ago

orchidmalevolence commented 2 years ago

Thanks for sharing the code! I studied it carefully but couldn't find the pretrained MSRResNet model (not the trained DASR model). Could you provide a link for it? I'm also very interested in the training yml of MSRResNet; it would be great if you could add it!

Some minor questions:

  1. I applied a similar idea (a degradation sub-space plus a predictor) in my SR model, but found it really hard to train a good predictor: the average L1 regression loss plateaus around 0.25 (which I take to mean the predictor only outputs a random embedding) and stops decreasing. Did you run into a similar problem?
  2. I found a "cycle_opt" loss in the train_DASR yml that is actually unused in training. Does it have a special meaning?

Thanks again for your work.

csjliang commented 2 years ago

Hi,

Thanks for your question and sorry for the late reply. The pretrained MSRResNet model can be found in the BasicSR project, link: https://drive.google.com/drive/folders/1qgzA7BakP7Y8MCGNK2a4Sh2cAkC012lE.

For the degradation prediction: have you checked the effectiveness of the target degradation parameters, and have you normalized them to make training easier? In my case, the predictor trains stably.
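A minimal sketch of the kind of min-max normalization meant here (the parameter names and ranges below are made up for illustration; the real ranges come from the degradation settings in the training yml):

```python
# Hypothetical degradation-parameter ranges (illustrative only; in practice
# these should match the sampling ranges used by the degradation pipeline).
PARAM_RANGES = {
    "blur_sigma": (0.2, 3.0),
    "noise_level": (1.0, 30.0),
    "jpeg_quality": (30.0, 95.0),
    "resize_scale": (0.15, 1.5),
}

def normalize_params(raw):
    """Min-max normalize each degradation parameter to [0, 1] so the
    regression targets share a common scale."""
    out = {}
    for name, value in raw.items():
        lo, hi = PARAM_RANGES[name]
        out[name] = (value - lo) / (hi - lo)
    return out

# Mid-range raw values map to 0.5 after normalization.
print(normalize_params({"blur_sigma": 1.6, "noise_level": 15.5,
                        "jpeg_quality": 62.5, "resize_scale": 0.825}))
```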

As for 'cycle_opt', sorry for the confusion. It is a loss we experimented with and then removed; you can simply ignore it. Thanks.

orchidmalevolence commented 2 years ago

Hi, thanks for the reminder and the reply.

I adapted and ran `train_DASR.yml` in my environment for a few epochs, but the average l_regression still plateaus around 0.25 :(

```
INFO: [train..][epoch: 1, iter: 10,800, lr:(1.000e-04,)] [eta: 3 days, 0:57:59, time (data): 0.527 (0.013)] l_pix: 7.5087e-02 l_regression: 2.0657e-01 l_percep: 1.2652e+01 l_g_gan: 1.4202e-01 l_d_real: 5.1343e-01 out_d_real: 1.1895e+00 l_d_fake: 4.5197e-01 out_d_fake: -9.6800e-01
INFO: [train..][epoch: 1, iter: 10,900, lr:(1.000e-04,)] [eta: 3 days, 0:56:32, time (data): 0.525 (0.013)] l_pix: 6.9376e-02 l_regression: 2.2325e-01 l_percep: 1.2195e+01 l_g_gan: 1.2894e-01 l_d_real: 7.4900e-01 out_d_real: 9.3070e-01 l_d_fake: 6.7128e-01 out_d_fake: -6.1778e-01
INFO: [train..][epoch: 1, iter: 11,000, lr:(1.000e-04,)] [eta: 3 days, 0:55:07, time (data): 0.535 (0.013)] l_pix: 6.7170e-02 l_regression: 2.8803e-01 l_percep: 1.1297e+01 l_g_gan: 1.3885e-01 l_d_real: 4.2849e-01 out_d_real: 2.2085e+00 l_d_fake: 5.2274e-01 out_d_fake: -8.6520e-01
```

As I commented above, I think the predictor only outputs a random embedding in this case, since the expected L1 distance between two independent uniform variables on [0, 1] is 1/3 (and the degradation distribution should of course be more concentrated than that).
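For what it's worth, these expectations can be sanity-checked with a quick Monte Carlo simulation (assuming, as a rough approximation on my part, that the normalized targets are uniform on [0, 1]): an uninformative random prediction gives an expected L1 of 1/3, while a predictor that collapses to the constant 0.5 gives exactly 1/4 = 0.25, which matches the observed plateau:

```python
import random

random.seed(0)
N = 200_000

# Prediction independent of the target, both ~ Uniform(0, 1):
# analytically E|X - Y| = 1/3.
mc = sum(abs(random.random() - random.random()) for _ in range(N)) / N

# Predictor collapsed to the constant 0.5 (the target mean):
# analytically E|0.5 - Y| = 1/4, i.e. exactly the 0.25 plateau above.
const = sum(abs(0.5 - random.random()) for _ in range(N)) / N

print(round(mc, 3), round(const, 3))
```

So a loss stuck near 0.25 is consistent with the predictor outputting the mean rather than anything input-dependent.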

Here is some of my analysis: checking my training log, I found the predictor actually gives much better predictions for the blur and noise parameters, but worse ones for the resize and JPEG parameters. Here's an example.

```
[INFO] predicted_params: tensor([5.6190e-01, 4.9756e-01, 5.0243e-01, 4.9990e-01, 3.3823e-13, 3.3490e-13, 3.4622e-13, 3.2713e-13, 3.5963e-13, 3.3968e-13, 4.2745e-01, 2.8342e-12, 2.8198e-12, 2.9235e-12, 3.4760e-13, 3.1584e-13, 3.2878e-13, 3.3205e-13, 5.0778e-01, 7.5701e-12, 1.0840e-01, 9.9986e-01, 3.3148e-13, 3.4470e-13, 3.3532e-13, 3.4616e-13, 5.1279e-01, 3.3276e-13, 3.4435e-13, 3.3630e-13, 2.8137e-12, 2.6319e-12, 2.8021e-12], device='cuda:0')
[INFO] d_params: tensor([0.5714, 0.4295, 0.8903, 0.7053, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.4286, 0.0000, 1.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.2135, 0.0000, 0.0000, 1.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.9695, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 1.0000], device='cuda:0')
```

From my point of view, it is hard for a predictor to learn the resize scale or grayscale probability from an already-degraded image alone.

Besides, I don't think explicit degradation prediction with an L1 loss is a good idea; an abstract representation learned without supervision may work better.
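One unsupervised alternative would be a contrastive (InfoNCE-style) objective over degradation embeddings, where two patches sharing the same degradation form a positive pair and patches with other degradations are negatives. A toy NumPy sketch (the shapes, temperature, and linear setup are illustrative, not anything from this repo):

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.07):
    """InfoNCE loss for one anchor: pull the positive (a patch with the
    same degradation) close, push negatives (other degradations) away.
    Embeddings are L2-normalized so dot products are cosine similarities."""
    def norm(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)
    a, p, n = norm(anchor), norm(positive), norm(negatives)
    logits = np.concatenate([[a @ p], n @ a]) / tau  # positive at index 0
    logits -= logits.max()                           # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])

rng = np.random.default_rng(0)
a = rng.normal(size=8)
# A near-duplicate positive (same degradation) should give a lower loss
# than an unrelated positive.
good = info_nce(a, a + 0.01 * rng.normal(size=8), rng.normal(size=(16, 8)))
bad = info_nce(a, rng.normal(size=8), rng.normal(size=(16, 8)))
print(good < bad)
```

The embedding is then trained so that patches with matching degradations cluster, without ever regressing the raw degradation parameters.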

Anyway, I really like the idea of using a degradation representation to guide dynamic convolution or a mixture of experts, and I will keep exploring in that direction. Many thanks again!
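That guidance idea can be sketched as collapsing K expert kernels into one convolution kernel using routing weights predicted from the degradation embedding. The toy linear router and shapes below are illustrative, not DASR's actual implementation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_conv_weight(experts, degradation_embedding, router):
    """Mix K expert conv kernels with gates predicted from the degradation
    embedding (a linear router here, purely for illustration).
    experts: (K, out_c, in_c, kh, kw); router: (K, embed_dim)."""
    gate = softmax(router @ degradation_embedding)  # (K,), sums to 1
    return np.tensordot(gate, experts, axes=1)      # (out_c, in_c, kh, kw)

rng = np.random.default_rng(0)
K, out_c, in_c, k = 4, 8, 8, 3
experts = rng.normal(size=(K, out_c, in_c, k, k))
embed = rng.normal(size=16)                         # degradation embedding
router = rng.normal(size=(K, 16))
w = moe_conv_weight(experts, embed, router)
print(w.shape)  # (8, 8, 3, 3)
```

In a real network the mixed kernel `w` would then be applied per-image, so the effective convolution adapts to each input's predicted degradation.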