Dear Authors,
Thank you for your great work in proposing a new channel mixing technique. I have a question: why don't you expand the dimension of the input in the first EMM operation, as MLP channel mixing usually does (first projecting the input to a higher-dimensional space and then projecting it back to the original dimension)?
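To make the question concrete, here is a minimal sketch of the expand-then-project pattern I am referring to. It is only the conventional MLP channel-mixing baseline, not the paper's EMM. The expansion factor of 4, the ReLU activation, and all function names are my own assumptions for illustration.

```python
import random

def init_linear(d_in, d_out, rng):
    """Hypothetical helper: small random weights (d_in x d_out) and a zero bias."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(d_out)] for _ in range(d_in)]
    b = [0.0] * d_out
    return w, b

def linear(x, w, b):
    """Compute y = x @ w + b for a single token vector x."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) + b[j]
            for j in range(len(b))]

def mlp_channel_mix(x, params):
    """Conventional MLP channel mixing: expand, apply nonlinearity, project back."""
    (w1, b1), (w2, b2) = params
    hidden = [max(0.0, h) for h in linear(x, w1, b1)]  # expand to dim * expansion, then ReLU
    return linear(hidden, w2, b2)                      # project back to dim

rng = random.Random(0)
dim, expansion = 8, 4  # assumed values, chosen only for illustration
params = (init_linear(dim, dim * expansion, rng),
          init_linear(dim * expansion, dim, rng))

x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
y = mlp_channel_mix(x, params)
hidden_width = len(params[0][1])
print(len(x), hidden_width, len(y))  # prints "8 32 8"
```

The hidden layer here is 4 times wider than the input, which is the expansion step my question asks about for the first EMM operation.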
Thank you!