dair-ai / research_emotion_analysis

:smile: Multilingual emotion analysis research

Q: For what languages do we want to collect data? #5

Open omarsar opened 4 years ago

omarsar commented 4 years ago

Please include the languages that you think we should collect data for. If you have experience working in a specific language, that will be useful and you can propose collecting emotion-related data in that language.

omarsar commented 4 years ago

I have worked with both English and Spanish. I am also looking at my dialect, Creole.

fmplaza commented 4 years ago

In my PhD I'm working with both English and Spanish too, but I focus more on Spanish as it is my mother tongue. I have experience collecting Twitter messages.

maraimm commented 4 years ago

I will contribute for Arabic.

Maybe it would be good to discuss the data collection in the next meeting. Are we creating new resources or making use of existing ones?

KhalidAlt commented 4 years ago

I would like to contribute in both Arabic and English.

omarsar commented 4 years ago

> In my PhD I'm working with both English and Spanish too, but I focus more on Spanish as it is my mother tongue. I have experience collecting Twitter messages.

@fmplaza do you know of any large-scale dataset for Spanish? I haven't come across any.

omarsar commented 4 years ago

> I will contribute for Arabic.
>
> Maybe it would be good to discuss the data collection in the next meeting. Are we creating new resources or making use of existing ones?

@maraimm we are creating new resources. I will emphasize the data collection part in the next meeting. Thanks. Arabic data will be great as well. Have you looked around to see if there are any existing datasets for emotion recognition?

@KhalidAlt, feel free to share any information you come across as well.

Let's have some updates on this for our next meeting.

KwasiArhin commented 4 years ago

I will look to see if there are any datasets in Twi that I can find. Otherwise I can only participate with English, haha.

fmplaza commented 4 years ago

> > In my PhD I'm working with both English and Spanish too, but I focus more on Spanish as it is my mother tongue. I have experience collecting Twitter messages.
>
> @fmplaza do you know of any large-scale dataset for Spanish? I haven't come across any.

@omarsar I know three different emotion datasets for Spanish labeled at the tweet level, but none of them is large:

  1. EmoEvent: A Multilingual Emotion Corpus based on different Events. I'm one of the authors of this paper, which was recently published at the LREC conference. The Spanish version of the EmoEvent dataset contains 8,409 tweets. Labels: anger, fear, sadness, joy, disgust, surprise, other.

  2. Datasets from SemEval-2018 Task 1: Affect in Tweets. The AIT dataset comprises the datasets used in the following subtasks:

    1. E-c (multi-label emotion classification). The dataset contains 7,094 tweets, but note that it is a multi-label classification dataset. Labels: anger, anticipation, disgust, fear, joy, love, optimism, pessimism, sadness, surprise, trust, or neutral (no emotion).

    2. EI-oc (emotion intensity ordinal classification) and EI-reg (emotion intensity regression) subtasks. The dataset contains 7,953 tweets. Labels: anger, fear, sadness, joy.
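Since the three label sets above differ, a quick way to see what they have in common is to intersect them. The following is only a minimal sketch: the dictionary keys and the helper names `shared_labels` and `extra_labels` are made up for illustration, and "neutral" stands in for the E-c "neutral (no emotion)" class.

```python
# Label sets copied from the list above. The keys are informal names
# chosen for this sketch, not official dataset identifiers.
SPANISH_LABEL_SETS = {
    "EmoEvent": {"anger", "fear", "sadness", "joy", "disgust", "surprise", "other"},
    "SemEval-2018 E-c": {
        "anger", "anticipation", "disgust", "fear", "joy", "love",
        "optimism", "pessimism", "sadness", "surprise", "trust",
        "neutral",  # assumption: stands in for "neutral (no emotion)"
    },
    "SemEval-2018 EI-oc/EI-reg": {"anger", "fear", "sadness", "joy"},
}


def shared_labels(label_sets):
    """Return the labels present in every dataset, sorted alphabetically."""
    return sorted(set.intersection(*label_sets.values()))


def extra_labels(label_sets):
    """Map each dataset name to its labels that are not shared by all datasets."""
    common = set(shared_labels(label_sets))
    return {name: sorted(labels - common) for name, labels in label_sets.items()}


if __name__ == "__main__":
    print("Shared labels:", shared_labels(SPANISH_LABEL_SETS))
    for name, extras in extra_labels(SPANISH_LABEL_SETS).items():
        print(f"{name}: extra labels = {extras}")
```

Any label outside the shared set would need an explicit decision (drop it, or map it to something like "other") before the datasets could be combined.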

maraimm commented 4 years ago

For Arabic, there are many efforts, and most of them have resulted in small datasets. The following are the datasets I found in the first phase of the search.

omarsar commented 4 years ago

@maraimm those are great findings. Do you mind giving us a short overview of your findings in the next meeting? It doesn't have to be a long presentation. We would just like an update.

maraimm commented 4 years ago

@omarsar Yeah, sure. I am not sure when the next meeting is. What are the date and time?

omarsar commented 4 years ago

@maraimm it's scheduled for next Saturday (25 July 2020 - 15:00 CEST). I will send the Zoom link in our Slack group.

maraimm commented 4 years ago

Thanks, @omarsar. Unfortunately, I am not sure I will be able to join the call on Saturday. In case I am not able to, shall I prepare something today to share with the team tomorrow (a summary, for example)?

Will the session be recorded?

omarsar commented 4 years ago

@maraimm the summary would be excellent. If it's a recording, even better; then I can share it with the group when we meet again. All sessions are being recorded.

maraimm commented 4 years ago

Hi @omarsar,

I emailed you a short recording.

Thanks

omarsar commented 4 years ago

@maraimm Thank you for the video recording. I have added it to our meeting notes.

cahya-wirawan commented 4 years ago

Hi, sorry for joining late. I found a paper from late 2018 about emotion classification on Indonesian Twitter (https://www.researchgate.net/publication/330674171_Emotion_Classification_on_Indonesian_Twitter_Dataset). They collected and annotated 7,500 tweets with 5 emotions: love, joy, anger, sadness, and fear.

rfazeli commented 3 years ago

I can collect data for Persian.