SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Create dataset loader for Sentiment-Annotated Taglish Product and Service Reviews (SentiTaglish: Products and Services) #720

Closed: SamuelCahyawijaya closed this issue 2 months ago

SamuelCahyawijaya commented 3 months ago

Dataloader name: sentiment_taglist_product_review/sentiment_taglist_product_review.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?sentiment_taglist_product_review

Dataset: sentiment_taglist_product_review
Description: Sentiment-Annotated Taglish Product and Service Reviews (SentiTaglish: Products and Services) is a gold standard, sentiment-annotated corpus for the Tagalog-English language pair. The data set was created using publicly available online service and product reviews from Google Maps Reviews and Shopee Philippines. It contains 10,510 product and service reviews which were manually labeled by three human annotators according to four sentiment classes: Positive, Negative, Neutral, and Mixed.
Subsets: -
Languages: tgl, eng
Tasks: Sentiment Analysis, Text Classification
License: Creative Commons Attribution 4.0 (cc-by-4.0)
Homepage: https://huggingface.co/datasets/ccosme/SentiTaglishProductsAndServices
HF URL: https://huggingface.co/datasets/ccosme/SentiTaglishProductsAndServices
Paper URL: -

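For whoever picks this up, here is a minimal sketch of how raw rows might be mapped to an id/text/label layout for the Sentiment Analysis task. The four label names come from the description above. The column names `review` and `sentiment`, the helper `to_examples`, and the output keys are assumptions for illustration, not taken from the dataset files or from the SEACrowd schema code, so please verify them against the Hugging Face page before relying on this.

```python
from typing import Dict, Iterable, Iterator, Tuple

# The four sentiment classes named in the dataset description.
LABELS = ["Positive", "Negative", "Neutral", "Mixed"]

# Hypothetical column names; check the actual files on the Hugging Face page.
TEXT_COLUMN = "review"
LABEL_COLUMN = "sentiment"


def to_examples(rows: Iterable[Dict[str, str]]) -> Iterator[Tuple[int, Dict[str, str]]]:
    """Yield (key, example) pairs in an assumed id/text/label layout."""
    for idx, row in enumerate(rows):
        # Normalise whitespace and casing so "positive" and " Positive " both match.
        label = row[LABEL_COLUMN].strip().capitalize()
        if label not in LABELS:
            raise ValueError(f"Unexpected label {row[LABEL_COLUMN]!r} in row {idx}")
        yield idx, {"id": str(idx), "text": row[TEXT_COLUMN], "label": label}
```

Raising on an unknown label, rather than silently dropping the row, is a deliberate choice in this sketch: it surfaces any mismatch between the documented four classes and what the files actually contain.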
ElijahGut commented 2 months ago

self-assign