bwconrad / soft-moe

PyTorch implementation of "From Sparse to Soft Mixtures of Experts"
Apache License 2.0
42 stars, 3 forks