issue about training the model

Paper99 / SRFBN_CVPR19

Pytorch code for our paper "Feedback Network for Image Super-Resolution" (CVPR2019)

MIT License


issue about training the model #37

Open xrjiang527 opened 5 years ago

xrjiang527 commented 5 years ago

===> Training Epoch: [1/1000]...  Learning Rate: 0.000100
Epoch: [1/1000]: 100%|#####################################| 1000/1000 [11:54<00:00, 1.40it/s, Batch Loss: 0.5712]

Epoch: [1/1000]   Avg Train Loss: 9.177065
===> Validating...
[Set5] PSNR: 12.43   SSIM: 0.0694   Loss: 0.629247   Best PSNR: 12.43 in Epoch: [1]
===> Saving last checkpoint to [experiments/RDN_in3f64_x2/epochs/last_ckp.pth] ...]
===> Saving best checkpoint to [experiments/RDN_in3f64_x2/epochs/best_ckp.pth] ...]

The test results are wrong. What should I do to solve the problem? Thank you.

Paper99 commented 5 years ago

When training RDN, please make sure that rgb_range in your *.json file is set to 1. If it already is, then an average training loss of 9.177.. is far too large for rgb_range=1; a value below 1 would be reasonable. My guess is that something is wrong with your training data.
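One way to act on this advice is to check the pixel range of a few training images before starting a run. The sketch below is not part of the repository: `check_rgb_range` and `l1_loss` are hypothetical helpers, and it assumes an L1-style loss averaged over pixels. It only illustrates why a loss well above 1 is suspicious when both images are supposed to lie in [0, rgb_range] with rgb_range=1.

```python
import numpy as np


def check_rgb_range(img, rgb_range=1.0):
    """Report whether pixel values fit the configured rgb_range (hypothetical helper)."""
    img = np.asarray(img, dtype=np.float64)
    lo, hi = float(img.min()), float(img.max())
    return {"min": lo, "max": hi, "in_range": lo >= 0.0 and hi <= rgb_range}


def l1_loss(pred, target):
    """Mean absolute error, i.e. what an L1 loss averages over all pixels."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean(np.abs(pred - target)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scaled = rng.random((8, 8, 3))   # values in [0, 1), matches rgb_range=1
    unscaled = scaled * 255.0        # same image left in [0, 255)

    print(check_rgb_range(scaled, 1.0)["in_range"])    # consistent with rgb_range=1
    print(check_rgb_range(unscaled, 1.0)["in_range"])  # flags a range mismatch
    # With both images inside [0, 1], the mean absolute error cannot exceed 1.
    print(l1_loss(scaled, np.zeros_like(scaled)))
```

If the check flags a mismatch, the data and the rgb_range setting disagree. If it passes, the large loss has to come from somewhere else, for example from the network outputs themselves.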

xrjiang527 commented 5 years ago

I used Prepare_TrainData_HR_LR.m to generate the HR/LR training pairs. When preparing the x2 data, I only changed the scale from '4' to '2'. Could there still be something wrong with the training data? Thank you!
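A quick consistency check on the generated pairs can rule out the most common data problem after changing the scale: HR and LR images whose sizes no longer match the scale factor. This is a minimal sketch, not code from the repository. `pair_matches_scale` and `find_bad_pairs` are hypothetical names, and the images are assumed to be already loaded as arrays of shape (height, width, channels).

```python
import numpy as np


def pair_matches_scale(hr, lr, scale=2):
    """True if the HR image is exactly `scale` times the LR image in height and width."""
    hr_h, hr_w = np.asarray(hr).shape[:2]
    lr_h, lr_w = np.asarray(lr).shape[:2]
    return hr_h == lr_h * scale and hr_w == lr_w * scale


def find_bad_pairs(pairs, scale=2):
    """Return the indices of (hr, lr) pairs whose sizes do not match the scale."""
    return [i for i, (hr, lr) in enumerate(pairs)
            if not pair_matches_scale(hr, lr, scale)]


if __name__ == "__main__":
    hr = np.zeros((96, 96, 3))
    lr_x2 = np.zeros((48, 48, 3))   # correct for scale 2
    lr_x4 = np.zeros((24, 24, 3))   # left over from a scale-4 run
    print(find_bad_pairs([(hr, lr_x2), (hr, lr_x4)], scale=2))
```

Any index it reports points at a pair that was probably generated with a different scale than the one set in the training configuration.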

Paper99 commented 5 years ago

I re-confirmed the training process of RDNx2. It is OK. My log is shown below:

Method: RDN || Scale: 2 || Epoch Range: (1 ~ 1000)

===> Training Epoch: [1/1000]...  Learning Rate: 0.000100
Epoch: [1/1000]: 100%|██████████| 1000/1000 [05:30<00:00,  2.82it/s, Batch Loss: 0.0297]

Epoch: [1/1000]   Avg Train Loss: 0.068141
===> Validating...
[Set5] PSNR: 28.15   SSIM: 0.9314   Loss: 0.038095   Best PSNR: 28.15 in Epoch: [1]
===> Saving last checkpoint to [experiments/RDN_in3f64_x2/epochs/last_ckp.pth] ...]
===> Saving best checkpoint to [experiments/RDN_in3f64_x2/epochs/best_ckp.pth] ...]

Maybe you can clone the latest code and try again, or regenerate your training data.

Paper99 commented 5 years ago

I found the cause of the problem you mentioned. Comment out this line to train the model without Kaiming initialization. This is a very interesting phenomenon for image SR.
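The general idea of switching Kaiming initialization on and off can be sketched in PyTorch as follows. This is an illustration only: `init_weights`, `build_toy_net` and `output_scale` are hypothetical names, the three-layer network is a toy stand-in rather than RDN, and none of it reproduces the repository's actual initialization code or the specific line referred to above.

```python
import torch
import torch.nn as nn


def init_weights(net, use_kaiming=True):
    """Optionally apply Kaiming-normal init to every conv layer (hypothetical helper)."""
    if not use_kaiming:
        # Same effect as commenting out the init call: PyTorch defaults are kept.
        return net
    for m in net.modules():
        if isinstance(m, nn.Conv2d):
            nn.init.kaiming_normal_(m.weight, a=0, mode="fan_in", nonlinearity="relu")
            if m.bias is not None:
                nn.init.zeros_(m.bias)
    return net


def build_toy_net():
    # Toy stand-in for an SR network, NOT the RDN architecture.
    return nn.Sequential(
        nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 3, 3, padding=1),
    )


def output_scale(net, seed=0):
    """Mean absolute output for a random input in [0, 1], i.e. rgb_range=1."""
    torch.manual_seed(seed)
    x = torch.rand(1, 3, 32, 32)
    with torch.no_grad():
        return net(x).abs().mean().item()


if __name__ == "__main__":
    default_net = init_weights(build_toy_net(), use_kaiming=False)
    kaiming_net = init_weights(build_toy_net(), use_kaiming=True)
    print("default init:", output_scale(default_net))
    print("kaiming init:", output_scale(kaiming_net))
```

In this toy setup the Kaiming-initialized network produces larger initial outputs than the default initialization. Whether that is related to the loss values reported in this thread is not confirmed here.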

Senwang98 commented 3 years ago

@Paper99 Have you trained the whole RDN? What was the final result?