This is a large-scale cross-lingual summarization dataset containing article-summary samples in more than 1,500 language pairs, including pairs among Burmese, Indonesian, and Vietnamese. In each pair, an article in the first language is matched with a summary in the second language.
Subsets
id-my, id-vi, my-id, my-vi, vi-id, vi-my
Languages
ind, vie, mya
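The subset names and language codes above can be related with a small sketch. It assumes each subset name has the form article language, hyphen, summary language, using two-letter ISO 639-1 codes, and maps them to the ISO 639-3 codes listed under Languages. The names ISO_639_3, parse_subset, and SUBSETS are hypothetical helpers for illustration and are not part of the dataloader.

```python
from itertools import permutations
from typing import Tuple

# ISO 639-1 -> ISO 639-3 codes for the three languages in this card.
ISO_639_3 = {"id": "ind", "vi": "vie", "my": "mya"}


def parse_subset(name: str) -> Tuple[str, str]:
    """Split a subset name such as 'id-my' into (article_lang, summary_lang).

    Assumption: the first code is the article language and the second
    is the summary language, as the description states.
    """
    article, summary = name.split("-")
    return ISO_639_3[article], ISO_639_3[summary]


# Every ordered pair of distinct languages, sorted alphabetically.
SUBSETS = sorted("-".join(pair) for pair in permutations(ISO_639_3, 2))

if __name__ == "__main__":
    for subset in SUBSETS:
        article_lang, summary_lang = parse_subset(subset)
        print(f"{subset}: article={article_lang}, summary={summary_lang}")
```

Three languages give six ordered pairs, which matches the six subsets listed under Subsets.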
Tasks
Abstractive Summarization
License
Creative Commons Attribution Non Commercial Share Alike 4.0 (cc-by-nc-sa-4.0)
Dataloader name
crosssum/crosssum.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?crosssum