cpis7 closed this issue 2 weeks ago
The router is defined in Eq. 2. You can implement it in any way you like and adjust its parameter size to suit your task. In our original implementation, we used Eq. 3 to implement the router, where $\theta$ and $\phi$ are instantiated as two 2-layer MLPs (Sec. 3.3).
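Until the official code is released, here is a rough sketch of what "two 2-layer MLPs" could look like. It is not the authors' implementation. The thread does not reproduce Eq. 3, so the way $\theta$ and $\phi$ are combined here (applying $\phi$ to the output of $\theta$, then a softmax over experts) is purely an assumption, as are the ReLU activation, the layer sizes, and all function names. Check Eq. 3 in the paper for the actual form.

```python
import numpy as np


def init_mlp(rng, d_in, d_hidden, d_out):
    """Create parameters for a 2-layer MLP: Linear -> ReLU -> Linear."""
    return {
        "W1": rng.normal(0.0, d_in ** -0.5, size=(d_in, d_hidden)),
        "b1": np.zeros(d_hidden),
        "W2": rng.normal(0.0, d_hidden ** -0.5, size=(d_hidden, d_out)),
        "b2": np.zeros(d_out),
    }


def mlp(params, x):
    """Forward pass of the 2-layer MLP on a batch x of shape (batch, d_in)."""
    h = np.maximum(x @ params["W1"] + params["b1"], 0.0)  # ReLU is an assumption
    return h @ params["W2"] + params["b2"]


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def router(theta, phi, x):
    """Hypothetical router: returns one weight per expert for each input.

    ASSUMPTION: phi is applied to the output of theta and the result is
    normalized with a softmax. The real composition is given by Eq. 3.
    """
    return softmax(mlp(phi, mlp(theta, x)))


# Illustrative sizes only; adjust them to your task.
rng = np.random.default_rng(0)
d_in, d_hidden, d_feat, n_experts = 16, 32, 32, 4
theta = init_mlp(rng, d_in, d_hidden, d_feat)
phi = init_mlp(rng, d_feat, d_hidden, n_experts)

x = rng.normal(size=(8, d_in))      # batch of 8 input feature vectors
weights = router(theta, phi, x)     # shape (8, n_experts), each row sums to 1
```

Changing `d_hidden` and `d_feat` is the simplest way to adjust the router's parameter size for a given task.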
Can you provide some code reference? Thanks
Hi, the code will be available next month.
Thanks for your great contribution! Could you explain more about the router implementation? I looked for details on the router but couldn't find them in the paper. What kind of network did you use for the router?