keras-team / keras-hub

Pretrained model hub for Keras 3
Apache License 2.0

Add the gMLP Encoder Block #98

Closed abheesht17 closed 1 month ago

abheesht17 commented 2 years ago

The gMLP model is from the paper "Pay Attention to MLPs". It has a decent number of citations - around 40. Each encoder block consists merely of linear layers and a "spatial gating unit". It would be a good addition to the library, considering that the research community is looking for alternatives to self-attention, and because, despite its simplicity, the model achieves performance comparable to Transformers.
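For reference, the key piece of the block is the spatial gating unit (SGU): the channels are split in half, one half is layer-normalized and linearly projected *across the sequence axis*, and the result gates the other half element-wise. A minimal NumPy sketch under my reading of the paper (function and variable names are illustrative, not keras-hub API):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize over the channel (last) axis.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def spatial_gating_unit(x, w_spatial, b_spatial):
    """x: (seq_len, 2 * d_ffn). Split channels, gate one half with a
    linear projection over the *sequence* axis of the other half."""
    u, v = np.split(x, 2, axis=-1)          # each (seq_len, d_ffn)
    v = layer_norm(v)
    v = w_spatial @ v + b_spatial[:, None]  # mixing across token positions
    return u * v                            # element-wise gating

seq_len, d_ffn = 4, 8
rng = np.random.default_rng(0)
x = rng.normal(size=(seq_len, 2 * d_ffn))
# The paper initializes the spatial weights near zero and the bias near
# one, so the SGU starts out close to an identity gate on u.
w_spatial = np.zeros((seq_len, seq_len))
b_spatial = np.ones(seq_len)
out = spatial_gating_unit(x, w_spatial, b_spatial)
print(out.shape)  # (4, 8)
```

With that initialization, `v` becomes all ones and the output equals the `u` half exactly, which is why training is stable early on. In the full block this SGU sits between two channel projections with a GELU in between, plus the usual residual connection.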

chenmoneygithub commented 2 years ago

@abheesht17 Thanks for opening this feature request!

The idea of the paper is definitely interesting! But at this moment I am not convinced that gMLP can be a good replacement for the Transformer. It claims that fewer parameters are required, but we can also control the number of encoders or the size of their SGUs. We will have more discussion about this next week, and you could also add this to your GSoC proposal if you want. Thanks again!

abheesht17 commented 2 years ago

Awesome! Thanks, @chenmoneygithub. Will add it to the doc :)