deepseek-ai/DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
MIT License
Add MoE offloading strategy? #28 (Open)
Minami-su opened 4 months ago
Minami-su commented 4 months ago:
https://arxiv.org/abs/2312.17238