lucidrains / mixture-of-experts

A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
MIT License
628 stars, 49 forks

convolution operation #8

Open Yonsun-w opened 2 years ago

Yonsun-w commented 2 years ago

Hello, my input is an image tensor of shape (batch, channel, h, w), and I would like each expert to be a convolution operation. How can I implement this?
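
One possible way to approach this is sketched below. It is a minimal illustration in plain PyTorch and does not use this repository's API. The class name `ConvMoE`, its parameters, and the routing scheme are all assumptions made for the example: each image is routed as a whole (not per pixel or per patch), the gate scores experts from globally average-pooled channel features, each expert is a single 3x3 `nn.Conv2d`, and the top-k gate scores are renormalized with a softmax. The noisy gating and auxiliary load-balancing loss from the sparsely-gated MoE paper are left out for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvMoE(nn.Module):
    """Hypothetical per-image top-k mixture of convolutional experts."""

    def __init__(self, in_channels, out_channels, num_experts=4, top_k=2):
        super().__init__()
        assert 1 <= top_k <= num_experts
        self.top_k = top_k
        self.out_channels = out_channels
        # Each expert is one 3x3 convolution; padding=1 preserves h and w.
        self.experts = nn.ModuleList(
            [nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
             for _ in range(num_experts)]
        )
        # The gate sees one feature vector per image (mean over h and w).
        self.gate = nn.Linear(in_channels, num_experts)

    def forward(self, x):
        batch, _, height, width = x.shape
        logits = self.gate(x.mean(dim=(2, 3)))                # (batch, num_experts)
        top_vals, top_idx = logits.topk(self.top_k, dim=-1)   # (batch, top_k)
        weights = F.softmax(top_vals, dim=-1)                 # (batch, top_k)

        out = x.new_zeros(batch, self.out_channels, height, width)
        for expert_id, expert in enumerate(self.experts):
            chosen = top_idx == expert_id        # (batch, top_k) boolean
            rows = chosen.any(dim=-1)            # images routed to this expert
            if not rows.any():
                continue                         # skip experts with no images
            # Gate weight of this expert for each routed image.
            gate_w = (weights * chosen).sum(dim=-1)[rows]
            out[rows] += gate_w.view(-1, 1, 1, 1) * expert(x[rows])
        return out


if __name__ == "__main__":
    moe = ConvMoE(in_channels=3, out_channels=8, num_experts=4, top_k=2)
    images = torch.randn(5, 3, 16, 16)
    print(moe(images).shape)
```

Only the experts selected for an image are run on it, so compute scales with `top_k` and not with the number of experts. If routing per spatial location is preferred, an alternative would be to flatten the feature map to (batch, h * w, channel) tokens and route those, though the experts would then act on individual positions and no longer be spatial convolutions.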