rayleizhu / BiFormer

[CVPR 2023] Official code release of our paper "BiFormer: Vision Transformer with Bi-Level Routing Attention"
https://arxiv.org/abs/2303.08810
MIT License

No PatchEmbedding · Issue #51

Closed Flash-Alita closed 5 months ago

Flash-Alita commented 5 months ago

Hello there: I have noticed that all BiFormer models take NCHW inputs, including the old_legacy version. May I ask why you don't adopt the PatchEmbedding approach? What do you think about PatchEmbedding in general? Will transformers perform better without PatchEmbedding? Do you have any conclusions on this? Thanks.
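
For context, here is a minimal sketch (not BiFormer's actual code) contrasting the two stem styles the question refers to: a ViT-style PatchEmbedding that flattens the image into a (B, N, C) token sequence, versus a convolutional stem that keeps the NCHW feature-map layout. The module names, channel sizes, and strides below are illustrative assumptions, not the repo's implementation.

```python
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    """ViT-style patch embedding: split the image into non-overlapping
    patches and flatten them into a (B, N, C) token sequence."""
    def __init__(self, in_chans=3, embed_dim=96, patch_size=4):
        super().__init__()
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                     # x: (B, 3, H, W)
        x = self.proj(x)                      # (B, C, H/4, W/4)
        return x.flatten(2).transpose(1, 2)   # (B, N, C) token sequence

class ConvStemNCHW(nn.Module):
    """Convolutional stem that downsamples but keeps the NCHW layout,
    as the question observes the BiFormer models do."""
    def __init__(self, in_chans=3, embed_dim=96):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_chans, embed_dim // 2, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(embed_dim // 2),
            nn.GELU(),
            nn.Conv2d(embed_dim // 2, embed_dim, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(embed_dim),
        )

    def forward(self, x):                     # x: (B, 3, H, W)
        return self.stem(x)                   # (B, C, H/4, W/4), stays NCHW

x = torch.randn(2, 3, 224, 224)
print(PatchEmbedding()(x).shape)              # torch.Size([2, 3136, 96])
print(ConvStemNCHW()(x).shape)                # torch.Size([2, 96, 56, 56])
```

Both stems reduce spatial resolution by 4x; the difference is only whether the output is kept as a 2D feature map (NCHW) or flattened into tokens before the transformer blocks.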