SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Create dataset loader for LEX-INDO: AN INDONESIAN LEXICON #437

Status: Closed (SamuelCahyawijaya closed 5 months ago)

SamuelCahyawijaya commented 7 months ago

Dataloader name: lex_indo/lex_indo.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?lex_indo

Dataset: lex_indo
Description: This open-source lexicon consists of 2,000 common Indonesian words, with phoneme series attached.
Subsets: -
Languages: ind
Tasks: Lexical Normalization
License: Creative Commons Attribution Non Commercial No Derivatives 4.0 (cc-by-nc-nd-4.0)
Homepage: https://magichub.com/datasets/indonesian-lexicon/
HF URL: -
Paper URL: -
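
For whoever picks this up, here is a rough sketch of the parsing step such a loader might need, given that each entry is a word with a phoneme series attached. The tab-separated line layout, the function name `parse_lexicon`, and the field names `id`, `word`, and `phonemes` are all assumptions for illustration; the actual file format on the homepage should be checked first, and the sample rows are made up rather than taken from the dataset.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_lexicon(lines: Iterable[str]) -> Iterator[Tuple[int, Dict[str, object]]]:
    """Yield (key, example) pairs from lexicon lines.

    Assumed (hypothetical) line format: word<TAB>phoneme phoneme ...
    """
    key = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split once on the first tab: left side is the word,
        # right side is the space-separated phoneme series.
        word, _, phoneme_field = line.partition("\t")
        phonemes: List[str] = phoneme_field.split()
        yield key, {"id": str(key), "word": word, "phonemes": phonemes}
        key += 1


# Made-up sample rows, for illustration only; not taken from the dataset.
sample = ["makan\tm a k a n", "", "buku\tb u k u"]
examples = list(parse_lexicon(sample))
```

Yielding `(key, example)` pairs mirrors the shape a Hugging Face `datasets` builder's `_generate_examples` method is expected to produce, so a function like this could be adapted into the dataloader once the real format is confirmed.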
joanitolopo commented 7 months ago

self-assign