Performance of shared_rel_pos_bias

czczup / ViT-Adapter

[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions

https://arxiv.org/abs/2205.08534

Apache License 2.0


Performance of shared_rel_pos_bias #177

Closed bio-mlhui closed 4 months ago

bio-mlhui commented 5 months ago

Hello, wonderful work! I wonder what would happen if we set use_shared_rel_pos_bias=True, so that the relative position bias table is shared across layers. There would be two ablations: (A) inject the bias at every attention layer; (B) inject the bias only at the input. How would the performance of A, B, and the layer-specific relative position bias compare?
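
To make the comparison concrete, here is a minimal sketch of how I understand the difference between a shared table and layer-specific tables. It is plain Python and not the repository's code: the helper names (`relative_position_index`, `make_table`, `build_tables`, `gather_bias`, `count_params`) are hypothetical, class tokens are ignored, and the table values are random placeholders. Ablation A would correspond to gathering from the one shared table at every layer; B is not sketched, since I am not sure how "only at input" would be wired in.

```python
import random


def relative_position_index(wh, ww):
    """Map every (query, key) pair in a wh x ww window to a row of the bias table."""
    coords = [(r, c) for r in range(wh) for c in range(ww)]
    index = []
    for qr, qc in coords:
        row = []
        for kr, kc in coords:
            dr = qr - kr + wh - 1  # shifted to the range 0 .. 2*wh - 2
            dc = qc - kc + ww - 1  # shifted to the range 0 .. 2*ww - 2
            row.append(dr * (2 * ww - 1) + dc)
        index.append(row)
    return index


def make_table(wh, ww, num_heads, rng):
    """One learnable table: (2*wh - 1) * (2*ww - 1) rows, one column per head."""
    rows = (2 * wh - 1) * (2 * ww - 1)
    return [[rng.gauss(0.0, 0.02) for _ in range(num_heads)] for _ in range(rows)]


def build_tables(num_layers, wh, ww, num_heads, shared, seed=0):
    """Return one table per layer; with shared=True every layer reuses the same object."""
    rng = random.Random(seed)
    if shared:
        table = make_table(wh, ww, num_heads, rng)
        return [table] * num_layers
    return [make_table(wh, ww, num_heads, rng) for _ in range(num_layers)]


def gather_bias(table, index, head):
    """Bias matrix (tokens x tokens) that would be added to one head's attention logits."""
    return [[table[i][head] for i in row] for row in index]


def count_params(tables):
    """Count parameters, counting a table reused by several layers only once."""
    unique = {id(t): t for t in tables}
    return sum(len(t) * len(t[0]) for t in unique.values())


index = relative_position_index(2, 2)
shared_tables = build_tables(num_layers=4, wh=2, ww=2, num_heads=3, shared=True)
per_layer_tables = build_tables(num_layers=4, wh=2, ww=2, num_heads=3, shared=False)
```

In this toy setting the shared variant holds one table regardless of depth, while the layer-specific variant grows linearly with the number of layers; the gathered bias has the same shape in both cases, so only the parameters differ, not the attention computation.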