SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Create dataset loader for IndoAbusive #717

Open SamuelCahyawijaya opened 3 months ago

SamuelCahyawijaya commented 3 months ago

Dataloader name: indoabusive/indoabusive.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?indoabusive

Dataset: indoabusive
Description: IndoAbusive contains tweet data classified as abusive and non-abusive.
Subsets: -
Languages: ind
Tasks: Abusive Language Detection
License: Creative Commons Attribution 4.0 (cc-by-4.0)
Homepage: https://github.com/fathanick/Dataset-for-Abusive-and-Non-Abusive-Tweet-Identification
HF URL: -
Paper URL: http://inacl.id/journal/index.php/jlk/article/download/15/16

HilariusJeremy commented 1 month ago

self-assign