IndoNLP / nusa-crowd

A collaborative project to collect datasets in Indonesian languages.
Apache License 2.0

Create dataset loader for indo wiki parallel corpora #242

Closed by SamuelCahyawijaya 2 years ago

SamuelCahyawijaya commented 2 years ago

NusaCatalogue: https://indonlp.github.io/nusa-catalogue/card.html?id_wiki_parallel

Dataset: id_wiki_parallel
Description: Manually aligned parallel corpora from Wikipedia
License: Unknown

SamuelCahyawijaya commented 2 years ago

Duplicate of https://github.com/IndoNLP/nusa-crowd/issues/228