microsoft / Swin-Transformer

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
https://arxiv.org/abs/2103.14030
MIT License

Question about `relative_coords_table` buffer in Swin V2. #329

Open BlueDruddigon opened 1 year ago

BlueDruddigon commented 1 year ago

Swin Transformer V2 proposes a log-spaced continuous position bias (Log-CPB).

In the code, the log-spaced coordinates used when transferring across window sizes are generated in `relative_coords_table`, as in this snippet:

https://github.com/microsoft/Swin-Transformer/blob/f92123a0035930d89cf53fcb8257199481c4428d/models/swin_transformer_v2.py#L109-L111
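For reference, here is a minimal sketch of the log-spaced mapping as the Swin V2 paper states it, sign(Δ) · log(1 + |Δ|), applied to 1-D relative offsets. It is not the repository's code: the helper names `log_spaced` and `log_spaced_table` are hypothetical, and any normalization or scaling that the linked lines apply to the table is left out.

```python
import math


def log_spaced(delta: float) -> float:
    """Map a linear relative offset to log space: sign(delta) * log(1 + |delta|)."""
    # log1p(|delta|) is log(1 + |delta|); copysign restores the sign of delta.
    return math.copysign(math.log1p(abs(delta)), delta)


def log_spaced_table(window_size: int) -> list:
    """Log-spaced values for every 1-D relative offset inside one window.

    Offsets run from -(window_size - 1) to +(window_size - 1) inclusive.
    (Hypothetical helper, for illustration only.)
    """
    offsets = range(-(window_size - 1), window_size)
    return [log_spaced(d) for d in offsets]


if __name__ == "__main__":
    for ws in (8, 16):
        table = log_spaced_table(ws)
        print(ws, len(table), round(table[-1], 4))
```

Under this sketch, going from an 8×8 to a 16×16 window widens the linear offset range from [-7, 7] to [-15, 15], while the log-spaced range only grows from about [-2.08, 2.08] to about [-2.77, 2.77], so the bias network has to extrapolate much less.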

I have two questions: