xinntao / EDVR

Winning Solution in NTIRE19 Challenges on Video Restoration and Enhancement (CVPR19 Workshops) - Video Restoration with Enhanced Deformable Convolutional Networks. EDVR has been merged into BasicSR and this repo is a mirror of BasicSR.
https://github.com/xinntao/BasicSR

Difference between moderate and larger models, with and without TSA models #144

Open YoungJoongUNC opened 4 years ago

YoungJoongUNC commented 4 years ago

Hello, I would like to first thank you for sharing your great work! I have some questions regarding the number of channels used in your model.

My questions are:

1) Regardless of whether the model is moderate (M) or larger (L), what is the purpose of having versions with and without the TSA module?

In your code, you provide training options for the moderate model (64 channels) with and without the TSA module, and I saw that the moderate SR model with the TSA module is trained starting from the pretrained weights of the model without TSA. Also, Section 4.1 of the paper (Training Datasets and Details) says: "We initialize deeper networks by parameters from shallower ones for faster convergence."

Is the purpose of the model without TSA to initialize the weights of the complete model with TSA? Or is it just for ablation studies to confirm the effect of the TSA module?

2) What are the different purposes of the moderate (M) and large (L) models?

In Section 3.4 of your paper (Two-Stage Restoration), there is this sentence: "Specifically, a similar but shallower EDVR network is cascaded to refine the output frames of the first stage".

Is the large model (L, deeper) used for the first-stage training and the moderate model (M, shallower) used for the second-stage training?

Or is the moderate model just a simpler version for demonstration, while the larger model is what you actually used for stage 1 training? In that case, is the difference between the stage 1 and stage 2 models the number of RBs in the reconstruction stage, 40 and 20 respectively, as mentioned in Section 4.1 (Training Details) of your paper? (And the number of RBs in the PCD alignment is 5 in both the stage 1 and stage 2 models, right?)

Thank you!

tongjuntx commented 4 years ago

According to my understanding:
1. The purpose of the model without TSA is to initialize the weights of the complete model with TSA.
2. Stage 2 is just for fine-tuning. Stage 1 of both the moderate model and the large model can have its own stage 2; the moderate model can also be used to initialize the large model.
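
For anyone reading later, the initialization described in point 1 can be sketched roughly as below. This is only an illustration, not the repo's actual loading code: plain dicts stand in for checkpoint state dicts, flat lists stand in for tensors, and the parameter names (`pcd_align.weight`, `tsa_fusion.weight`, `recon.weight`) and the helper `init_from_pretrained` are made up for the example. The idea is to copy every pretrained parameter whose name and size match, and leave the new (TSA) parameters at their fresh initialization.

```python
def init_from_pretrained(target_state, pretrained_state):
    """Copy pretrained parameters into a larger model's state.

    Both arguments are plain dicts mapping parameter names to flat lists
    of floats (a stand-in for real tensors). Returns the merged state,
    the names that were copied, and the names left at their fresh init.
    """
    merged = dict(target_state)
    copied = []
    for name, value in pretrained_state.items():
        # Only copy when the target has a parameter of the same name and size.
        if name in merged and len(merged[name]) == len(value):
            merged[name] = list(value)
            copied.append(name)
    # Parameters that exist only in the target keep their fresh initialization.
    fresh = [name for name in target_state if name not in copied]
    return merged, copied, fresh


# Hypothetical checkpoint of a model trained without TSA.
no_tsa = {
    "pcd_align.weight": [0.1, 0.2],
    "recon.weight": [0.3, 0.4, 0.5],
}

# Hypothetical freshly initialized model that adds a TSA fusion module.
with_tsa = {
    "pcd_align.weight": [0.0, 0.0],
    "tsa_fusion.weight": [0.0, 0.0],
    "recon.weight": [0.0, 0.0, 0.0],
}

merged, copied, fresh = init_from_pretrained(with_tsa, no_tsa)
print("copied:", copied)
print("left at fresh init:", fresh)
```

In a PyTorch codebase the same effect is usually obtained with `model.load_state_dict(checkpoint, strict=False)`, which tolerates keys that are missing from the checkpoint; whether the training configs here do exactly that is an assumption on my part.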