GlobalMaksimum / sadedegel

A General Purpose NLP library for Turkish
http://sadedegel.ai
MIT License

Emoticons :) #264

Open dafajon opened 3 years ago

dafajon commented 3 years ago

Apart from emojis, there are emoticons such as :), ;), :P, xD. Tokenizers tend to dismiss them as punctuation, yet they are tokens of emotion.
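To make the problem concrete, here is a minimal sketch of a regex tokenizer that tries a known emoticon list before falling back to words and single punctuation marks. It is not sadedegel's tokenizer; `EMOTICONS`, `TOKEN_RE` and `tokenize` are hypothetical names, and the four-item list is only the examples mentioned above.

```python
import re

# Hypothetical, deliberately tiny emoticon list (assumption: a real list
# would be far longer and loaded from a curated resource).
EMOTICONS = [":)", ";)", ":P", "xD"]


def _emoticon_pattern(emoticon):
    """Escape one emoticon for use inside a regex alternation."""
    pattern = re.escape(emoticon)
    # If the emoticon ends in a letter or digit (":P", "xD"), refuse to match
    # when more word characters follow, so "xDrive" stays a single word.
    if emoticon[-1].isalnum():
        pattern += r"(?!\w)"
    return pattern


# Longest emoticons first so a longer one is preferred over a shorter prefix.
_ALTERNATION = "|".join(
    _emoticon_pattern(e) for e in sorted(EMOTICONS, key=len, reverse=True)
)

# Order matters: emoticons, then words, then any single non-space symbol.
TOKEN_RE = re.compile(rf"(?:{_ALTERNATION})|\w+|[^\w\s]")


def tokenize(text):
    """Split text into tokens, keeping known emoticons as single tokens."""
    return TOKEN_RE.findall(text)
```

With this sketch, `tokenize("Tamam;)")` keeps `;)` together instead of splitting it into `;` and `)`. Anything not in the list still falls through to the punctuation rule, which is why a complete emoticon list matters.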

husnusensoy commented 3 years ago

Please do not mention customer-specific projects or their names here. Do we have a complete list of such emoji tokens?

dafajon commented 3 years ago

This Wikipedia article contains separate lists of Western and Eastern (Korean, Japanese, etc.) style emoticons. They are also categorized by the emotions they convey. This can be used to build a list and an emoticon -> emotion dict.
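A rough sketch of how a list grouped by emotion could be turned into an emoticon -> emotion dict. The groups and labels below are hypothetical placeholders, not copied from the Wikipedia page, and `build_emoticon_dict` is an illustrative helper name.

```python
# Hypothetical sample: emotion category -> emoticons. The labels and the
# grouping are assumptions for illustration only.
EMOTION_TO_EMOTICONS = {
    "happy": [":)", ":-)", "xD"],
    "wink": [";)", ";-)"],
    "playful": [":P", ":-P"],
}


def build_emoticon_dict(groups):
    """Invert {emotion: [emoticons]} into {emoticon: emotion}."""
    mapping = {}
    for emotion, emoticons in groups.items():
        for emoticon in emoticons:
            # If an emoticon appears under two emotions, keep the first one.
            mapping.setdefault(emoticon, emotion)
    return mapping


EMOTICON_TO_EMOTION = build_emoticon_dict(EMOTION_TO_EMOTICONS)
```

The keys of `EMOTICON_TO_EMOTION` double as the plain emoticon list, so a single curated source can serve both purposes.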

ertugrul-dmr commented 3 years ago

@dafajon This dict below might be useful for this case:

https://github.com/NeelShah18/emot/blob/master/emot/emo_unicode.py

dafajon commented 3 years ago

This dict also refers to the same Wikipedia page, so we can use it. Thanks for sharing.