MassiveSum Summarization dataset

MassiveSum: a large summarization dataset for 92 languages, including 13 Indian languages with ~1.9 million article-summary pairs.
The sources have been curated manually, and the articles were downloaded from archive.org.
The summaries are mined from metadata in the HTML, such as the 'og:description' meta tag, and are therefore supposedly diverse. However, for some of the Indian-language websites I checked, the headline and the metadata title are the same.
So this dataset could be closer to a headline generation set than to a true summarization set.
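
The check described above can be sketched in a few lines of Python. This is only a rough illustration, not the MassiveSum extraction pipeline: the class `MetaTagParser`, the function `inspect_page`, and the sample HTML are made up for this note, and treating the first `<h1>` as the headline is an assumption. It pulls 'og:description' (the summary candidate) and 'og:title' out of a page and flags pages where the summary is just the headline or title again.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects <meta> property/name -> content pairs and the first <h1> text."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.headline = ""
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            key = attrs.get("property") or attrs.get("name")
            if key and attrs.get("content") is not None:
                self.meta[key] = attrs["content"]
        elif tag == "h1" and not self.headline:
            # Assumption: the first <h1> on the page is the article headline.
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1:
            self.headline += data


def _normalize(text):
    # Collapse whitespace and ignore case so trivial differences do not matter.
    return " ".join(text.split()).casefold()


def inspect_page(html):
    """Return the summary candidate, title, headline, and a headline-like flag."""
    parser = MetaTagParser()
    parser.feed(html)
    summary = parser.meta.get("og:description", "")
    title = parser.meta.get("og:title", "")
    headline = parser.headline
    same = bool(summary.strip()) and _normalize(summary) in {
        _normalize(title),
        _normalize(headline),
    }
    return {
        "summary": summary.strip(),
        "title": title.strip(),
        "headline": headline.strip(),
        "looks_like_headline": same,
    }


# Made-up sample page in which the metadata simply repeats the headline.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Monsoon arrives early in Kerala" />
<meta property="og:description" content="Monsoon arrives early in Kerala" />
</head><body><h1>Monsoon arrives early in Kerala</h1><p>Body text.</p></body></html>
"""

result = inspect_page(SAMPLE_HTML)
```

Running `inspect_page` over a sample of pages per source and counting how often `looks_like_headline` is true would give a rough idea of which websites contribute headline-like summaries.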

AI4Bharat / indicnlp_catalog