microsoft / tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation
MIT License
723 stars 93 forks

Merge A2A FFN overlapping and 2DH A2A #100

Closed yzygitzh closed 2 years ago