zzh-tech / ESTRNN

[ECCV2020 Spotlight] Efficient Spatio-Temporal Recurrent Neural Network for Video Deblurring
MIT License

About test code #1

Closed HyeongseokSon1 closed 4 years ago

HyeongseokSon1 commented 4 years ago

Hello, it seems that there is only the training code. Can you provide the test code used for the evaluation in the paper?

zzh-tech commented 4 years ago

I have added the test code for the result generation. Thank you.

HyeongseokSon1 commented 4 years ago

> I have added the test code for the result generation. Thank you.

Thank you for uploading the test code. However, I still have some issues with the training and test code. Please comment on the following:

  1. Test accuracy is much lower than validation accuracy during training, and the visual results are not sharp. For a single-GPU model, the test PSNR is 29.83 dB while the best validation PSNR is 30.73 dB. I trained the model with the old source code.

  2. Model performance (validation accuracy) seems to depend on the number of GPUs used in training (I used DDP mode).

zzh-tech commented 4 years ago

As for issue #1: The PSNR values for our model and the other models are all calculated through one pass of a randomly cropped (256x256) test dataloader, not on full-resolution video. What we want to compare is the deblurring efficiency of each model under the same experimental conditions. If you want sharper results, you could increase the number of training epochs, since the model has not converged at 500 epochs, and you can use a larger model configuration, e.g., increase "n_blocks" and "n_feats" as we did in the paper.
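For reference, a minimal sketch of this crop-based evaluation might look like the following (the `psnr` helper, the loader contents, and the [0, 1] tensor range are illustrative assumptions, not the repository's actual code):

```python
import torch
import torch.nn.functional as F

def psnr(pred, target, max_val=1.0):
    # PSNR = 10 * log10(MAX^2 / MSE), averaged over the whole batch
    mse = F.mse_loss(pred, target, reduction='mean')
    return 10.0 * torch.log10(max_val ** 2 / mse)

@torch.no_grad()
def evaluate_random_crops(model, test_loader, device='cuda'):
    # test_loader is assumed to yield randomly cropped 256x256 blurred/sharp
    # frame sequences; one pass over it gives the reported score
    model.eval()
    total, count = 0.0, 0
    for blur_seq, sharp_seq in test_loader:
        blur_seq, sharp_seq = blur_seq.to(device), sharp_seq.to(device)
        output = model(blur_seq)  # deblurred frames (frame alignment with
                                  # the ground truth is assumed to match)
        total += psnr(output, sharp_seq).item()
        count += 1
    return total / count
```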

As for issue #2: If you use DDP mode, the effective batch size is "num_gpus*batch_size", because each process (GPU) uses the batch size you set in "para". Batch size affects the performance of the model. In general, a smaller batch size is better for the same number of training epochs, but I believe the effect of the hyper-parameter "frames" is coupled with the batch size. This also applies to other models such as IFIRNN.
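A sketch of how that scaling comes about in PyTorch DDP (hypothetical worker function; the repository's actual training script is organized differently):

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler

def train_worker(rank, world_size, para, dataset, model):
    dist.init_process_group('nccl', rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank)
    # per-process batch size; across all GPUs the model effectively sees
    # world_size * para.batch_size samples per optimization step
    loader = DataLoader(dataset, batch_size=para.batch_size, sampler=sampler)

    ddp_model = DDP(model.cuda(rank), device_ids=[rank])
    # ... training loop over `loader` ...
```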

Thank you.

HyeongseokSon1 commented 4 years ago

Thank you for the fast reply.

For issue #2, I understand your reply. However, for issue #1, my training setting is the same as the model (B9C80) in the paper, which is trained for 500 epochs and reported with a PSNR of 30.79 dB. My validation accuracy is similar to the paper's value, even though it is calculated from random crops, but my test accuracy is much lower than the paper's value.

I think there may be a problem with the test code. Could you provide the pretrained model, or check again that the current test code is valid?

zzh-tech commented 4 years ago

I didn't write code in the test function to calculate PSNR, since the validation and test sets in GOPRO are the same. Could you please send me your code for calculating PSNR and your (B9C80) checkpoint? My email: zzh.tech@gmail.com. Thank you.
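If it helps, a rough offline check along these lines (just an assumption about the workflow, using skimage's PSNR on saved result frames rather than any metric code from the repository) would be:

```python
import glob
import cv2
from skimage.metrics import peak_signal_noise_ratio

def average_psnr(result_dir, gt_dir):
    # compare saved deblurred frames against ground-truth frames, assuming
    # both directories contain matching, identically sorted PNG filenames
    psnrs = []
    for res_path, gt_path in zip(sorted(glob.glob(f'{result_dir}/*.png')),
                                 sorted(glob.glob(f'{gt_dir}/*.png'))):
        res = cv2.imread(res_path)
        gt = cv2.imread(gt_path)
        psnrs.append(peak_signal_noise_ratio(gt, res, data_range=255))
    return sum(psnrs) / len(psnrs)
```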

HyeongseokSon1 commented 4 years ago

You mean that the PSNR values in Table 1 of the paper are validation accuracies, calculated from random crops? Thank you.

zzh-tech commented 4 years ago

Yes, it may seem strange. But since none of the models have converged at 500 epochs, I had to choose one value to compare, so I chose the best value for each model. I ran all the other models in the same way, so from this perspective the comparison is fair. And when you train for enough epochs (over 500), the current validation PSNR will be very close to the best one. Sorry for the confusion.

HyeongseokSon1 commented 4 years ago

OK, I see; my question is settled. Thank you. I will close this issue.