Open Harzva opened 1 year ago
Describe the question(问题描述) Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts
In the MOE method does expert have to learn and can the frozen model be used as an expert?like gpt3 bert
thanks you very much
Describe the question(问题描述) Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts
In the MOE method does expert have to learn and can the frozen model be used as an expert?like gpt3 bert