huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

MASS: A generalization of BERT and GPT #6455

Closed Jeevesh8 closed 3 years ago

Jeevesh8 commented 4 years ago

🌟 New model addition

Model description

MASS is a novel pre-training method for sequence-to-sequence language generation tasks. It randomly masks a sentence fragment in the encoder input and then predicts that fragment in the decoder. In this way, MASS jointly trains the encoder and decoder, developing both representation extraction and language modeling capabilities. This pre-training is especially helpful when the encoder and decoder are shared across multiple languages.
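
For anyone picking this up, here is a minimal sketch of the masked-fragment objective described above, just to make the encoder/decoder split concrete. The `mass_mask` helper, the `[MASK]` string, and the 50% fragment-length ratio are illustrative assumptions on my part, not the paper's exact hyperparameters or the transformers API.

```python
# Minimal sketch of a MASS-style masked-span objective (illustrative only).
import random

MASK = "[MASK]"  # placeholder mask symbol (assumption, not the real vocabulary token)

def mass_mask(tokens, mask_ratio=0.5, seed=None):
    """Mask a contiguous fragment for the encoder and return it as the decoder target."""
    rng = random.Random(seed)
    span_len = max(1, int(len(tokens) * mask_ratio))
    start = rng.randint(0, len(tokens) - span_len)

    # Encoder sees the sentence with the fragment replaced by mask tokens.
    encoder_input = tokens[:start] + [MASK] * span_len + tokens[start + span_len:]

    # Decoder predicts only the masked fragment; its input is the fragment
    # shifted right by one position (teacher forcing).
    target = tokens[start:start + span_len]
    decoder_input = [MASK] + target[:-1]
    return encoder_input, decoder_input, target

if __name__ == "__main__":
    sent = "the quick brown fox jumps over the lazy dog".split()
    enc, dec_in, tgt = mass_mask(sent, seed=0)
    print(enc)     # sentence with a contiguous span masked
    print(dec_in)  # shifted fragment fed to the decoder
    print(tgt)     # fragment the decoder must predict
```

A real implementation would of course work on token IDs and build attention masks, but the core idea is just this contiguous-span split between encoder input and decoder target.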

Open source status

This is my first time contributing to this repository, so forgive me for any mistakes. Please let me know whether I should go ahead with it or not. Also, if anyone wants to come along and help, please let me know that too! 😀

Jeevesh8 commented 4 years ago

I can also try their MP-Net next.

Jeevesh8 commented 4 years ago

Sorry, I just saw the request for MP-Net here. Seems I was behind. So, shall I close this issue, or does anyone still want a separate MASS model here? @RyanHuangNLP

Jeevesh8 commented 4 years ago

@RyanHuangNLP @StillKeepTry , @tobyoup , @xutaatmicrosoftdotcom

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.