Closed AlaFalaki closed 4 years ago
RoBERTa is trained with a Masked Language Modeling (MLM) pretraining objective. An MLM objective typically does well on NLU downstream tasks (classification, regression, etc.) but not as well on generation tasks (summarization, dialogue, translation, etc.). There are some papers that try to use it for generation, but we haven't tried that ourselves.
That said, we have another project called BART, which trains a seq2seq model with a denoising objective. It does quite well on summarization (and is almost on par with RoBERTa on NLU tasks). Here are the instructions to use it.
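As a rough sketch of what using BART for summarization can look like, the helper below loads a CNN/DailyMail-fine-tuned checkpoint through `torch.hub` (this assumes `torch` and `fairseq` are installed; the generation hyperparameters are the ones the fairseq BART examples use for CNN/DailyMail, and the checkpoint name `bart.large.cnn` is the published hub identifier):

```python
def summarize(article, checkpoint="bart.large.cnn", beam=4, lenpen=2.0,
              max_len_b=140, min_len=55, no_repeat_ngram_size=3):
    """Summarize `article` with a BART model loaded via torch.hub.

    Defaults follow the fairseq BART CNN/DailyMail generation settings:
    beam search with 4 beams, a length penalty, min/max output lengths,
    and trigram blocking to reduce repetition.
    """
    import torch  # deferred import: requires torch + fairseq installed

    # Download (on first use) and load the pretrained summarization model.
    bart = torch.hub.load("pytorch/fairseq", checkpoint)
    bart.eval()

    with torch.no_grad():
        # `sample` tokenizes, runs beam search, and detokenizes the output.
        return bart.sample(
            [article],
            beam=beam,
            lenpen=lenpen,
            max_len_b=max_len_b,
            min_len=min_len,
            no_repeat_ngram_size=no_repeat_ngram_size,
        )[0]
```

Calling `summarize(long_article)` returns a single abstractive summary string; note the first call downloads the checkpoint, which is large.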
Hello, I just want to know whether it is possible to use the RoBERTa architecture for tasks like abstractive summarization. I couldn't find any clue in the documentation or code.
Thanks in advance.