Westlake-AI / MogaNet

[ICLR 2024] MogaNet: Efficient Multi-order Gated Aggregation Network
https://arxiv.org/abs/2211.03295
Apache License 2.0

Some Questions about the paper #6

Closed: 123456789asdfjkl closed this issue 1 year ago

123456789asdfjkl commented 1 year ago

Hi! Thank you for your great work! I could not understand the highlighted passage and would like to ask how it should be interpreted. (image)

Lupin1998 commented 1 year ago

Hi, @123456789asdfjkl, thanks for your excellent question! Unfortunately, the highlighted passage also contains a typo: it should read "0-order interaction of each patch itself and n-order interaction covering all patches". Given an image of $n$ patches, the $m$-order interactions are defined by Eq. 1 and Eq. 2 in Representation Bottleneck, where $0\le m\le n-2$. There are two trivial cases: (a) When the interaction pair $i,j$ is the same patch, it is a 0-order interaction in which the patch interacts only with itself, i.e., $\mathrm{Conv}_{1\times 1}(\cdot)$. (b) When all patches are pooled into a single token, e.g., by global average pooling $\mathrm{GAP}(\cdot)$, the result cannot be measured by the concept of $m$-order interaction (because the contextual set cannot be larger than $n-2$). Therefore, by modelling these two trivial cases with $\mathrm{Conv}_{1\times 1}(\cdot)$ and $\mathrm{GAP}(\cdot)$, the proposed FD module is intended to help the model learn more useful interactions.
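To make the range $0\le m\le n-2$ concrete, here is a minimal sketch of the $m$-order interaction for two distinct patches $i$ and $j$. It assumes the definition in Representation Bottleneck has the form $I^{(m)}(i,j)=\mathbb{E}_{S\subseteq N\setminus\{i,j\},\,|S|=m}[\Delta f(i,j,S)]$ with $\Delta f(i,j,S)=f(S\cup\{i,j\})-f(S\cup\{i\})-f(S\cup\{j\})+f(S)$. The names `delta_f`, `multi_order_interaction`, and the toy set function `f_all` are hypothetical illustrations and are not part of the MogaNet code.

```python
from itertools import combinations


def delta_f(f, i, j, S):
    """Interaction of patches i and j given the context set S (assumed form)."""
    S = frozenset(S)
    return f(S | {i, j}) - f(S | {i}) - f(S | {j}) + f(S)


def multi_order_interaction(f, n, i, j, m):
    """Average delta_f over every context S of size m taken from the other n-2 patches."""
    if i == j:
        raise ValueError("this sketch only covers distinct patches i != j")
    if not 0 <= m <= n - 2:
        # Only n-2 patches remain once i and j are excluded from the context.
        raise ValueError("order m must satisfy 0 <= m <= n - 2")
    others = [k for k in range(n) if k not in (i, j)]
    contexts = list(combinations(others, m))
    return sum(delta_f(f, i, j, S) for S in contexts) / len(contexts)


# Toy "model" (hypothetical): it responds only when all n patches are visible.
n = 4


def f_all(S):
    return 1.0 if len(S) == n else 0.0


orders = [multi_order_interaction(f_all, n, 0, 1, m) for m in range(n - 1)]
print(orders)  # [0.0, 0.0, 1.0]
```

In this toy case the interaction between patches 0 and 1 appears only at the highest measurable order, $m=n-2$, and asking for a larger $m$ raises an error because no context set of that size exists.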


123456789asdfjkl commented 1 year ago

I understand now. Thank you for your explanation.