sail-sg / poolformer

PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)
https://arxiv.org/abs/2111.11418
Apache License 2.0

What makes pooling achieve competitive or even better performance than attention? #43

Closed sudo1609 closed 1 year ago

sudo1609 commented 1 year ago

In this paper, you show that the success of ViT does not come from the attention token mixer but from the general architecture, which you call MetaFormer. Remarkably, simply replacing attention with an extremely simple pooling operator still yields SOTA performance. So the question is: what makes pooling achieve competitive or even better performance than attention?

yuweihao commented 1 year ago

Hi @TheK2NumberOne , thanks for your attention.

| Model | MetaFormer | Token mixing ability | Local inductive bias | Params | MACs | Top-1 Acc |
| --- | --- | --- | --- | --- | --- | --- |
| ResNet-50 | No | Strong | More | 26M | 4.1G | 79.8 |
| PoolFormer-S24 | Yes | Weak | More | 21M | 3.4G | 80.3 |
| DeiT-S (Transformer) | Yes | Strong | Less | 22M | 4.6G | 79.8 |
  1. Compared with ResNet-50: the local spatial modeling ability of the pooling layer is much weaker than that of ResNet's convolutions, so PoolFormer's competitive performance can only be attributed to its general architecture, MetaFormer.
  2. Compared with DeiT-S: PoolFormer's better performance may result from the stronger local inductive bias of pooling.
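For reference, the pooling token mixer from the paper is indeed extremely simple. A minimal PyTorch sketch (following the design described in the paper, where the input is subtracted so the mixer only contributes the difference aggregated from neighboring tokens):

```python
import torch
import torch.nn as nn

class Pooling(nn.Module):
    """Average-pooling token mixer (sketch of the PoolFormer design).

    Stride 1 with symmetric padding keeps the spatial resolution,
    and subtracting the input means the residual branch models only
    the information gathered from neighboring tokens.
    """
    def __init__(self, pool_size: int = 3):
        super().__init__()
        self.pool = nn.AvgPool2d(
            pool_size, stride=1, padding=pool_size // 2,
            count_include_pad=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # pool(x) - x: on a constant input this is exactly zero,
        # i.e. the mixer adds nothing when there is nothing to mix.
        return self.pool(x) - x
```

Note that this mixer has no learnable parameters at all, which is why PoolFormer-S24 ends up smaller (21M params) than both ResNet-50 and DeiT-S while still reaching 80.3 top-1.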