We present our work on developing a multilingual, efficient text-to-text transformer that is suitable for handling long inputs. This model, called mLongT5, builds upon the architecture of LongT5, while leveraging the multilingual datasets used for pretraining mT5 and the pretraining tasks of UL2. We evaluate this model on a variety of multilingual summarization and question-answering tasks, and the results show stronger performance for mLongT5 when compared to existing multilingual models such as mBART or M-BERT.