Open LuigiNixy opened 2 months ago
PSRT is totally different from VRT. PSRT performs alignment using a transformer that mimics BasicVSR++'s second-order DCN. VRT is not recurrent, so it performs feature aggregation across frames directly, without alignment, using mutual Swin MSA and vanilla Swin MSA. The parallel warping is more like an explicit alignment, while TMSA itself is an implicit alignment. IART's contribution is the implicit alignment part, not the architecture; IART uses PSRT.
Both IART and VRT use a huge amount of memory.
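To make "explicit alignment" a bit more concrete, here is a minimal sketch of flow-based backward warping in plain Python. It is a generic illustration only, not code from VRT, PSRT, or IART. The function name `warp_bilinear`, the `(dx, dy)` flow layout, and the border clamping are assumptions made for this example.

```python
import math

def warp_bilinear(frame, flow):
    """Backward-warp a 2D frame (list of rows) with a per-pixel flow field.

    flow[y][x] = (dx, dy) means output pixel (x, y) samples the input
    at (x + dx, y + dy). Sample positions are clamped to the border,
    which is an assumption of this sketch.
    """
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Clamp the sampling position to the valid image area.
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            ax, ay = sx - x0, sy - y0
            # Bilinear interpolation between the four neighbouring pixels.
            top = (1 - ax) * frame[y0][x0] + ax * frame[y0][x1]
            bot = (1 - ax) * frame[y1][x0] + ax * frame[y1][x1]
            out[y][x] = (1 - ay) * top + ay * bot
    return out

# Toy neighbour frame and a flow that says "content moved one pixel left",
# so each output pixel looks one pixel to the right in the neighbour frame.
neighbor = [[0.0, 1.0, 2.0],
            [3.0, 4.0, 5.0],
            [6.0, 7.0, 8.0]]
flow = [[(1.0, 0.0)] * 3 for _ in range(3)]
aligned = warp_bilinear(neighbor, flow)
```

An implicit alignment, by contrast, has no such warping step: attention across frames decides which positions to aggregate.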
@yyhtbs-yye, thanks for your explanation.
Hi LuigiNixy, in the Alignment Study we actually used a small transformer, which PSRT is based on. You can find the model files and training configurations here: Ablation Study Files. You’ll need to integrate them with MMEditing to train the network.
If you need the complete but unorganized code, please send me an email separately.
Regards, Kai
Hi,
I'm a beginner in video restoration and I'm a bit confused about the Alignment Study part of the paper (Section 5 and Section B.1 in the supplementary material). For all the models listed in Table 1, which part of VRT did you modify? Is it the parallel warping part? What loss function did you use for training? Could you please release the code and the dataset for this part?
Thank you so much.