speakleash / speakleash-instruct-creator

Generate instruction datasets for fine-tuning purposes.

Emotions - sentiment mapping #85

Open Samox1 opened 1 month ago

Samox1 commented 1 month ago

I have created a simple script that processes data from the GoEmotions dataset (Google).

Link: https://github.com/google-research/google-research/tree/master/goemotions

Description: GoEmotions is a corpus of 58k carefully curated comments extracted from Reddit, with human annotations to 27 emotion categories or Neutral.

Attached is a ZIP with the script (ipynb) and the output in JSONL and Parquet: GoEmotions_script_jsonl_parquet.zip
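For readers who cannot open the attachment, here is a minimal sketch of what an emotion-to-sentiment mapping could look like. It is not the attached script. The grouping into positive, negative and ambiguous follows my understanding of the sentiment grouping published with GoEmotions and should be checked against the dataset repository. The record fields (`instruct`, `input`, `output`), the prompt text and the function names are hypothetical, chosen only for illustration.

```python
import json

# Assumed grouping of the 27 GoEmotions categories into sentiments.
# Verify against the grouping shipped with the dataset before relying on it.
SENTIMENT_GROUPS = {
    "positive": [
        "admiration", "amusement", "approval", "caring", "desire",
        "excitement", "gratitude", "joy", "love", "optimism",
        "pride", "relief",
    ],
    "negative": [
        "anger", "annoyance", "disappointment", "disapproval", "disgust",
        "embarrassment", "fear", "grief", "nervousness", "remorse",
        "sadness",
    ],
    "ambiguous": ["confusion", "curiosity", "realization", "surprise"],
}

# Invert the grouping into an emotion -> sentiment lookup table.
EMOTION_TO_SENTIMENT = {
    emotion: sentiment
    for sentiment, emotions in SENTIMENT_GROUPS.items()
    for emotion in emotions
}
# Neutral is not one of the 27 emotions, so it maps to itself here.
EMOTION_TO_SENTIMENT["neutral"] = "neutral"


def to_instruction_record(text, emotions):
    """Build one instruction record (hypothetical schema) for a comment.

    A comment can carry several emotion labels, so the output lists
    every distinct sentiment, sorted alphabetically.
    """
    sentiments = sorted({EMOTION_TO_SENTIMENT[e] for e in emotions})
    return {
        "instruct": "Classify the sentiment of the following comment.",
        "input": text,
        "output": ", ".join(sentiments),
    }


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

Calling `to_instruction_record(text, ["gratitude"])` yields a record whose `output` is `positive`. A comment labeled with both `joy` and `sadness` yields `negative, positive`. Whether mixed-sentiment comments should be kept, split or dropped is a design choice this sketch leaves open.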