Open PrashanthAadepu opened 5 years ago
Hello. Could you elaborate on what you mean by "data with emojis"? Do you mean emojis alone or with some surrounding text? As far as I remember, the author of this repo has added emoji support. Thanks.
Hi,
I have data like the sentences below:
😍 Love your service period! 😂😂😂😉🤗💕
When I classify sentences that contain only emojis, the model always predicts them as neutral.
Thanks.
Interesting. Does your training data consist of a mixture of text with and without emojis, or do all examples have emojis?
The data has the following variants:
1. Sentence with no emoji. Ex: Very useful for customers
2. Sentence with text and emoji. Ex: 😍 Love your service period!
3. Sentence with only emoji. Ex: 😂😂😂😉🤗💕
Thanks.
That's very surprising. Could you share the hyperparameters you used so that I can check whether something is wrong?
Hello
I am using the notebook below. I tweaked it to classify neutral sentiment as well.
I am using the tokenizer below to properly tokenize the emojis: https://github.com/google-research/bert/blob/master/tokenization.py
I also added some emojis to vocab.txt, which I pass to model training.
Thanks.
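One thing that might be worth checking with the vocab.txt approach: as far as I can tell, the WordPiece step in the linked tokenization.py does a greedy longest-match-first split of each whitespace-delimited word, and any piece that does not start the word needs a "##"-prefixed vocab entry. If that reading is right, a run like 😂😂😂 with no spaces is one "word", and adding only the bare 😂 to vocab.txt would still turn the whole run into [UNK]. Below is a minimal sketch of that matching loop. It is a simplified re-implementation for illustration, not the library code; `wordpiece` and the toy vocabulary are hypothetical names.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one whitespace-delimited word.

    Simplified sketch of the WordPiece idea; not the actual BERT code.
    """
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that do not start the word need the "##" prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # One unmatched position turns the whole word into the unknown token.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Toy vocabulary (hypothetical): only the bare emoji was added.
vocab = {"love", "😂"}
print(wordpiece("😂", vocab))      # ['😂']
print(wordpiece("😂😂😂", vocab))  # ['[UNK]']

# With a continuation entry, the run can be split into pieces.
vocab.add("##😂")
print(wordpiece("😂😂😂", vocab))  # ['😂', '##😂', '##😂']
```

If the emoji-only sentences end up as a single [UNK], the model sees almost no signal for them, which could be one reason they all land in the same class.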
Hey! Any improvements on that front? This seems surprising, since emojis should now be taken into account by the BERT tokenizer. The older version treated an emoji as an UNK token.
Hi Prashanth,
Could you share your code tweaks to classify neutral sentiments? I am starting with the same notebook and am actively struggling with making the same tweaks.
Thank you.
Hi, I am also trying to use BERT on data containing emojis but they are always encoded as
Hello, could you please share the dataset with emojis? That would be very helpful.
I am using BERT for sentiment classification. I am currently classifying into positive, negative, and neutral.
I have some data with emojis, and the model always classifies it as neutral. I think I am missing something here.
Could someone explain how to handle emoji data so that it is classified correctly?
Thanks in advance.
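One preprocessing step that may help, offered as a sketch rather than a confirmed fix: put spaces around each emoji before tokenization, so that every emoji becomes its own whitespace-delimited token instead of part of one long run. The helper below assumes the emojis in the data fall in Unicode category "So" (Symbol, other), which holds for the examples in this thread but not for everything (skin-tone modifiers and zero-width-joiner sequences are not handled, and other "So" symbols such as © get spaced out too). `space_out_symbols` is a hypothetical name, and each emoji would still need its own entry in the vocabulary.

```python
import unicodedata


def space_out_symbols(text):
    """Surround every Unicode 'So' character with spaces, then collapse whitespace."""
    pieces = []
    for ch in text:
        if unicodedata.category(ch) == "So":
            # Assumption: emojis of interest are in category "So".
            pieces.append(f" {ch} ")
        else:
            pieces.append(ch)
    # split() + join() collapses the doubled spaces introduced above.
    return " ".join("".join(pieces).split())


print(space_out_symbols("😂😂😂😉🤗💕"))
# 😂 😂 😂 😉 🤗 💕
print(space_out_symbols("😍 Love your service period!"))
# 😍 Love your service period!
```

Whatever preprocessing is used, it would have to be applied identically to the training data and at prediction time.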