PaddlePaddle / PaddleRec

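Recommendation Algorithm: a large-scale recommendation algorithm library covering classic and recent recommender-system algorithms, including LR, Wide&Deep, DSSM, TDM, MIND, Word2Vec, Bert4Rec, DeepWalk, SSR, AITM, DSIN, SIGN, IPREC, GRU4Rec, Youtube_dnn, NCF, GNN, FM, FFM, DeepFM, DCN, DIN, DIEN, DLRM, MMOE, PLE, ESMM, ESCMM, MAML, xDeepFM, DeepFEFM, NFM, AFM, RALM, DMR, GateNet, NAML, DIFM, Deep Crossing, PNN, BST, AutoInt, FGCNN, FLEN, Fibinet, ListWise, DeepRec, ENSFM, TiSAS, and AutoFIS, along with classic recommender-system datasets such as criteo and movielens.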
https://paddlerec.readthedocs.io/
Apache License 2.0
4.3k stars 722 forks

In the MoE method, do the experts have to be trained, or can a frozen model (e.g. GPT-3, BERT) be used as an expert? #933

Open Harzva opened 1 year ago

Harzva commented 1 year ago

Describe the question: Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts

In the MoE method, do the experts have to be trained, or can a frozen model such as GPT-3 or BERT be used as an expert?

Thank you very much!

wangzhen38 commented 1 year ago

We simply reproduced this model in PaddlePaddle following the source code of the paper, so it cannot directly use another frozen model as an expert. It does, however, support warm-starting from a model saved in earlier epochs.
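For readers wondering what a frozen expert would mean conceptually, here is a rough pure-Python sketch of the multi-gate mixing step. It is not PaddleRec's MMoE code: `Gate`, `mmoe_forward`, `frozen_expert_a`, and `frozen_expert_b` are hypothetical names made up for illustration, and the experts are stand-in functions rather than real pretrained models. The idea shown is only that each task's gate produces softmax weights over the expert outputs, so in principle the experts could be fixed functions while the gates alone hold trainable weights.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class Gate:
    """Hypothetical per-task gate: softmax over a linear map of the input."""

    def __init__(self, in_dim, num_experts, rng):
        # One weight row per expert; these would be the trainable parameters.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(num_experts)
        ]

    def __call__(self, x):
        scores = [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]
        return softmax(scores)


def mmoe_forward(x, experts, gates):
    """Return one gate-weighted mixture of expert outputs per task.

    experts: callables mapping x to equal-length lists of floats; they are
             only called, never updated, which is what "frozen" means here.
    gates:   one Gate per task.
    """
    expert_outputs = [expert(x) for expert in experts]
    out_dim = len(expert_outputs[0])
    mixed = []
    for gate in gates:
        probs = gate(x)
        mixed.append([
            sum(p * out[d] for p, out in zip(probs, expert_outputs))
            for d in range(out_dim)
        ])
    return mixed


# Stand-ins for frozen pretrained encoders (assumption: fixed functions).
def frozen_expert_a(x):
    return [sum(x), max(x)]


def frozen_expert_b(x):
    return [min(x), float(len(x))]


rng = random.Random(0)
x = [0.5, -1.0, 2.0]
gates = [Gate(len(x), 2, rng) for _ in range(2)]  # two tasks, two experts
task_inputs = mmoe_forward(x, [frozen_expert_a, frozen_expert_b], gates)
```

Each entry of `task_inputs` would then feed that task's tower network. Whether mixing frozen experts works well in practice is a separate question that this sketch does not answer.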