SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0
66 stars, 57 forks

Create dataset loader for malaysia-tweets-with-sentiment-labels #445

SamuelCahyawijaya closed this issue 7 months ago

SamuelCahyawijaya commented 8 months ago

Dataloader name: malaysia_tweets/malaysia_tweets.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?malaysia_tweets

Dataset: malaysia_tweets
Description: This dataset consists of tweets posted in Malaysia that match the keywords "social distancing" and "physical distancing". Sentiment analysis was conducted to understand public opinion on health messages during the COVID-19 pandemic. Tweets from January 2020 to July 2021 were extracted using the Python module snscrape, and sentiment labels were obtained automatically using the Polyglot and MALAYA NLP tools because the data is multilingual.
Subsets: -
Languages: zlm, eng
Tasks: Sentiment Analysis
License: Unknown (unknown)
Homepage: https://github.com/sarahjuan/malaysia-tweets-with-sentiment-labels
HF URL: -
Paper URL: https://link.springer.com/chapter/10.1007/978-981-16-8515-6_44

R-Damanhuri commented 8 months ago

self-assign

R-Damanhuri commented 8 months ago

image

Hello, this is my first time trying to contribute by creating a data loader. I have followed the instructions in the CONTRIBUTING.md, but I encountered an error like this. @holylovenia @SamuelCahyawijaya

holylovenia commented 8 months ago

> image
>
> Hello, this is my first time trying to contribute by creating a data loader. I have followed the instructions in the CONTRIBUTING.md, but I encountered an error like this. @holylovenia @SamuelCahyawijaya

Could you please try following this template skeleton?
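
For orientation, a loader for a dataset like this usually comes down to a generator that yields one keyed example per row. Below is a minimal sketch in plain Python. It is not the template skeleton itself. The `Tweet` and `Sentiment` column names, the `id`/`text`/`label` output fields, and the sample rows are all assumptions made for illustration, since the actual file layout is not shown in this thread.

```python
import csv
import io
from typing import Dict, Iterator, TextIO, Tuple

# Hypothetical column names; the real file layout is not described in this thread.
TEXT_COLUMN = "Tweet"
LABEL_COLUMN = "Sentiment"


def generate_examples(csv_file: TextIO) -> Iterator[Tuple[int, Dict[str, str]]]:
    """Yield (key, example) pairs, one per CSV row.

    The output field names ("id", "text", "label") are assumed for illustration.
    """
    reader = csv.DictReader(csv_file)
    for key, row in enumerate(reader):
        yield key, {
            "id": str(key),
            "text": row[TEXT_COLUMN],
            "label": row[LABEL_COLUMN],
        }


# Made-up sample rows, for illustration only (not taken from the dataset).
SAMPLE = (
    "Tweet,Sentiment\n"
    "Jaga jarak ya,positive\n"
    "Social distancing is hard,negative\n"
)

examples = list(generate_examples(io.StringIO(SAMPLE)))
for key, example in examples:
    print(key, example)
```

In a real loader, the same generator body would probably sit inside the method that the template skeleton asks you to fill in, reading from the downloaded file instead of an in-memory string.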