huawei-noah / Efficient-AI-Backbones

Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.

Wave-MLP looks like it uses depth-wise conv #191

Closed Phuoc-Hoan-Le closed 1 year ago

Phuoc-Hoan-Le commented 1 year ago

Hi,

Thank you for your great work on Wave-MLP. Looking at the code, I see it uses 1xK/Kx1 depth-wise convolutions. I am not sure how these can be directly translated into pure matrix multiplications, or in what sense Wave-MLP is an MLP model.
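For what it's worth, a 1xK depth-wise convolution on a single channel *can* be written as a matrix multiplication, just one whose weight matrix is banded (Toeplitz) rather than dense. A minimal numpy sketch of that equivalence (the names `seq_len`, `kernel`, etc. are illustrative, not taken from the Wave-MLP code):

```python
import numpy as np

# Sketch: a 1xK depth-wise conv over one channel equals multiplying the
# token sequence by a banded Toeplitz matrix built from the kernel.
rng = np.random.default_rng(0)
seq_len, K = 8, 3                     # 8 tokens, kernel size 3, "same" padding
x = rng.standard_normal(seq_len)
kernel = rng.standard_normal(K)

# Direct depth-wise convolution (cross-correlation with zero padding).
pad = K // 2
xp = np.pad(x, pad)
conv_out = np.array([xp[i:i + K] @ kernel for i in range(seq_len)])

# Equivalent matrix: row i holds the kernel centered on position i.
W = np.zeros((seq_len, seq_len))
for i in range(seq_len):
    for j in range(K):
        col = i + j - pad
        if 0 <= col < seq_len:
            W[i, col] = kernel[j]

matmul_out = W @ x
assert np.allclose(conv_out, matmul_out)
```

So it is a matrix multiplication, but with far fewer free parameters than a dense token-mixing MLP, which is exactly the weight-sharing question raised here.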

yehuitang commented 1 year ago

The window makes the model compatible with dense prediction tasks whose input images vary in size. Please take a look at Section 3.3 for details.

Phuoc-Hoan-Le commented 1 year ago

@yehuitang I understand that you need to limit the window size to handle dense prediction tasks with varying input image sizes. However, from what I know, MLP models such as MLP-Mixer, ResMLP, etc. don't share weights among pixels/patches; instead, they share weights among channels.

In other words, in MLP-based models, and even in Swin Transformers, each pixel/patch has its own filter, while the filters are shared along the channel dimension.
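The distinction above can be made concrete by counting parameters. This is a hedged sketch with made-up shapes (`N`, `C`, `K` are illustrative and not from either codebase): MLP-Mixer-style token mixing applies one dense NxN matrix to every channel, whereas a 1xK depth-wise conv gives each channel its own length-K kernel shared across positions.

```python
import numpy as np

N, C, K = 16, 64, 7   # tokens, channels, conv kernel size (illustrative)

# MLP-Mixer-style token mixing: one dense N x N matrix applied identically
# to every channel -- each pixel pair has its own weight, shared across channels.
mixer_token_params = N * N            # 256

# 1xK depth-wise conv: one length-K kernel per channel -- weights shared
# across pixel positions, but NOT across channels.
dwconv_params = C * K                 # 448

# The Mixer token-mixing step itself, showing the channel-wise sharing:
x = np.random.default_rng(1).standard_normal((N, C))
W_mix = np.random.default_rng(2).standard_normal((N, N))
mixed = W_mix @ x                     # same W_mix reused for all C channels
assert mixed.shape == (N, C)
```

The two designs trade sharing axes: the Mixer matrix scales with the (fixed) token count, the depth-wise conv with the channel count, which is also why the conv version handles variable input sizes more gracefully.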