kai422 / IART

[CVPR 2024 Highlight] Enhancing Video Super-Resolution via Implicit Resampling-based Alignment.

Questions about Alignment Study #13

Open LuigiNixy opened 2 months ago

LuigiNixy commented 2 months ago

Hi,

I'm a beginner in video restoration and a bit confused about the Alignment Study part of the paper (Section 5 & Section B.1 in the supp). For all the models listed in Table 1, which part of VRT did you modify? Is it the parallel warping part? What loss function did you use for training? Could you please release the code and the dataset for this part?

Thank you so much.

yyhtbs-yye commented 2 months ago

PSRT is totally different from VRT. PSRT performs alignment with a transformer that mimics BasicVSR++'s second-order DCN. VRT has no recurrence, so it aggregates features across frames directly, without alignment, using mutual Swin MSA and vanilla Swin MSA. The parallel warping is more like an explicit alignment, while TMSA itself is an implicit alignment. IART's contribution is the implicit alignment part, not the architecture; IART uses PSRT.
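To make the explicit side of that distinction concrete, here is a minimal pure-Python sketch of explicit alignment: a feature map is warped by a per-pixel flow field and resampled with fixed bilinear weights. This is a generic illustration, not code from VRT, PSRT, or IART; `bilinear_warp`, the toy `feat` map, and the constant `flow` are hypothetical names and values made up for the example.

```python
import math


def bilinear_warp(feat, flow):
    """Warp a 2-D map `feat` (list of rows) with a per-pixel flow (dx, dy).

    Output pixel (y, x) samples `feat` at (y + dy, x + dx) using bilinear
    interpolation; samples outside the map contribute zero.
    """
    h, w = len(feat), len(feat[0])

    def at(y, x):
        # Zero padding outside the map.
        return feat[y][x] if 0 <= y < h and 0 <= x < w else 0.0

    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            sx, sy = x + dx, y + dy                # sub-pixel sampling position
            x0, y0 = math.floor(sx), math.floor(sy)
            ax, ay = sx - x0, sy - y0              # fractional offsets in [0, 1)
            # Fixed (non-learned) weights over the four neighbouring pixels.
            out[y][x] = (
                (1 - ay) * (1 - ax) * at(y0, x0)
                + (1 - ay) * ax * at(y0, x0 + 1)
                + ay * (1 - ax) * at(y0 + 1, x0)
                + ay * ax * at(y0 + 1, x0 + 1)
            )
    return out


# Hypothetical toy input: a 2x3 map and a constant flow of half a pixel to the right.
feat = [[0.0, 2.0, 4.0],
        [1.0, 3.0, 5.0]]
flow = [[(0.5, 0.0)] * 3 for _ in range(2)]
warped = bilinear_warp(feat, flow)
```

The point of the sketch is that the four interpolation weights depend only on the fractional offsets. As far as I understand the paper, the implicit approach replaces this fixed resampling step with a learned one; the sketch does not attempt to show that part.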

Both IART and VRT use a lot of memory.

kai422 commented 2 months ago

@yyhtbs-yye, thanks for your explanation.

Hi LuigiNixy, in the Alignment Study we actually used a small transformer, the one PSRT is based on. You can find the model files and training configurations here: Ablation Study Files. You'll need to integrate them with MMEditing to train the network.

If you need the complete but unorganized code, please send me an email separately.

Regards, Kai