SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Create dataset loader for PRDECT-ID #276

Closed (SamuelCahyawijaya closed this issue 8 months ago)

SamuelCahyawijaya commented 10 months ago

Dataloader name: prdect_id/prdect_id.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?prdect-id

Dataset: prdect-id
Description: PRDECT-ID is a collection of Indonesian product reviews annotated with emotion and sentiment labels. The data were collected from Tokopedia, one of Indonesia's e-commerce giants. The dataset contains Indonesian-language product reviews from 29 product categories on Tokopedia. Each review is annotated with a single emotion: love, happiness, anger, fear, or sadness. A group of annotators assigned the emotion labels following annotation criteria created by an expert in clinical psychology. Other attributes related to each review, such as Location, Price, Overall Rating, Number Sold, Total Review, and Customer Rating, are also extracted to support further research.
Subsets: -
Languages: ind
Tasks: Sentiment Analysis, Emotion Classification
License: Creative Commons Attribution 4.0 (cc-by-4.0)
Homepage: https://data.mendeley.com/datasets/574v66hf2v/1
HF URL: -
Paper URL: https://www.sciencedirect.com/science/article/pii/S2352340922007612?via%3Dihub
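
For whoever picks this up, here is a rough sketch of how the example-generation step might look, based only on the description above. It is not the final loader: the column names (`Customer Review`, `Sentiment`, `Emotion`), the output field names, and the `generate_examples` helper are all assumptions to be checked against the actual file on the homepage, and the two sample rows are made up for illustration. A real loader would presumably wrap this logic in the project's dataloader template.

```python
import csv
import io

# Hypothetical column names; verify them against the real CSV header.
TEXT_COL = "Customer Review"
SENTIMENT_COL = "Sentiment"
EMOTION_COL = "Emotion"


def generate_examples(csv_file):
    """Yield (index, example) pairs from an open CSV file object.

    Labels are lowercased and stripped but otherwise kept as found,
    since the exact label spellings in the file are not confirmed here.
    """
    reader = csv.DictReader(csv_file)
    for idx, row in enumerate(reader):
        yield idx, {
            "id": str(idx),
            "text": row[TEXT_COL].strip(),
            "sentiment": row[SENTIMENT_COL].strip().lower(),
            "emotion": row[EMOTION_COL].strip().lower(),
        }


# Made-up sample rows, for illustration only (not taken from the dataset).
SAMPLE = (
    "Customer Review,Sentiment,Emotion\n"
    "Barang bagus,Positive,Happiness\n"
    "Pengiriman lambat,Negative,Anger\n"
)

examples = list(generate_examples(io.StringIO(SAMPLE)))
for idx, example in examples:
    print(idx, example)
```

Keeping sentiment and emotion as separate fields on the same example would let the two tasks listed above (Sentiment Analysis, Emotion Classification) share one parsing pass and differ only in which label field they expose.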

ljvmiranda921 commented 10 months ago

self-assign