IndoNLP / nusa-crowd

A collaborative project to collect datasets in Indonesian languages.
Apache License 2.0

Closes #227 Data loader for Karonese sentiment #230

Closed: aliakbars closed this 1 year ago

aliakbars commented 2 years ago

Please name your PR after the issue it closes. You can use the following line: "Closes #ISSUE-NUMBER" where you replace the ISSUE-NUMBER with the one corresponding to your dataset.

afaji commented 2 years ago

This dataset is a bit noisy at the moment: aside from inconsistent labeling (numeric vs. string), some rows have no labels at all. I've sent a PR to that dataset, https://github.com/imkarokaro123/karonese/pull/1, which cleans the data and also masks usernames to add some privacy.
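
For anyone following along, here is a rough sketch of the kind of cleanup described above: unify numeric and string labels, drop rows without a label, and mask usernames. It is not the code from the linked PR. The numeric-to-string mapping, the `<USER>` placeholder, the `@handle` pattern, and the `(text, label)` row shape are all assumptions made for illustration.

```python
import re

# Hypothetical mapping from numeric codes to string labels; the real
# dataset's coding scheme is not stated in this thread.
NUMERIC_TO_LABEL = {"0": "negative", "1": "neutral", "2": "positive"}
VALID_LABELS = set(NUMERIC_TO_LABEL.values())

# Matches @-style handles. Both the pattern and the "<USER>" placeholder
# are assumptions, not the masking scheme used upstream.
USERNAME_PATTERN = re.compile(r"@\w+")


def normalize_label(raw):
    """Return a canonical string label, or None if missing or unknown."""
    if raw is None:
        return None
    value = str(raw).strip().lower()
    if value in NUMERIC_TO_LABEL:
        return NUMERIC_TO_LABEL[value]
    if value in VALID_LABELS:
        return value
    return None


def mask_usernames(text):
    """Replace every @handle in the text with a fixed placeholder."""
    return USERNAME_PATTERN.sub("<USER>", text)


def clean_rows(rows):
    """Keep only labeled rows, with unified labels and masked usernames.

    `rows` is an iterable of (text, raw_label) pairs (assumed shape).
    """
    cleaned = []
    for text, raw_label in rows:
        label = normalize_label(raw_label)
        if label is None:
            continue  # drop rows with missing or unrecognized labels
        cleaned.append((mask_usernames(text), label))
    return cleaned
```

Rows whose label is empty or unrecognized are dropped rather than guessed, so the loader only sees one consistent label type.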

SamuelCahyawijaya commented 2 years ago

@afaji : So, let's just use the one from your fork and move forward with the PR, shall we?

aliakbars commented 1 year ago

Waiting for this PR to be approved: https://github.com/imkarokaro123/karonese/pull/3

SamuelCahyawijaya commented 1 year ago

Hi @aliakbars : Perhaps we can just use the data from your fork for now, since we haven't been able to get any update from the author of the dataset.

aliakbars commented 1 year ago

Updated. Should be working properly now, @SamuelCahyawijaya @holylovenia @muhsatrio.

SamuelCahyawijaya commented 1 year ago

/test dataset=karonese_sentiment