Closed h4nwei closed 1 year ago
Hi Dr Han:
Thank you for your interest.
I think it can be understood in an inverted way: the Patch Merging of Swin comes first, and then we design the fragments so that the merging only happens inside our mini-patches, rather than across them. In general, Patch Merging can be seen as a pooling layer with both stride and kernel size set to (2,2), so we call it a pooling layer in our paper (a special one, though, that can be matched).
To perform the ablation study, we simply modify the hyper-parameters during sampling so that they are no longer aligned.
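As a rough illustration of what "aligned" means here, the following is a minimal sketch and not code from the repository. It assumes a square fragment tiled by square mini-patches and a pooling layer whose kernel and stride are equal, and it counts how many pooling windows mix positions from more than one mini-patch. The function name `count_crossing_windows` and all sizes are hypothetical, chosen only for this example.

```python
def count_crossing_windows(fragment_size, mini_patch_size, window):
    """Count pooling windows (kernel == stride == window) that span
    more than one mini-patch in a square fragment.

    All names and sizes are illustrative assumptions, not the
    repository's actual sampling code.
    """
    crossing = 0
    # Non-overlapping windows tile the fragment starting from the origin.
    for top in range(0, fragment_size - window + 1, window):
        for left in range(0, fragment_size - window + 1, window):
            # A window stays inside one mini-patch only if its first and
            # last row (and column) fall into the same mini-patch index.
            rows_cross = top // mini_patch_size != (top + window - 1) // mini_patch_size
            cols_cross = left // mini_patch_size != (left + window - 1) // mini_patch_size
            if rows_cross or cols_cross:
                crossing += 1
    return crossing


# Aligned: mini-patch size 4 is a multiple of the 2x2 window.
aligned = count_crossing_windows(fragment_size=8, mini_patch_size=4, window=2)

# Misaligned: mini-patch size 3 is not a multiple of the 2x2 window,
# so some windows straddle a mini-patch boundary.
misaligned = count_crossing_windows(fragment_size=9, mini_patch_size=3, window=2)

print(aligned, misaligned)
```

With the aligned setting no window crosses a mini-patch boundary, while the misaligned setting produces windows that merge content from neighbouring mini-patches, which is the situation the ablation is meant to probe.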
Hope this can help.
Best, Haoning
Hi @teowu,
Thanks for solving my puzzle on the match constraint.
Best, Hanwei
Hi @teowu
Thanks for the interesting work. I have some questions regarding the match constraint:
Thanks for any help you can provide.
Best, Hanwei