Why do the two emoji approaches give different results, and when is each one better?

trinker / sentimentr

Dictionary based sentiment analysis that considers valence shifters


The documentation shows two approaches for handling emojis:

## Emojis
## Not run:
## Load R twitter data
x <- read.delim(system.file("docs/r_tweets.txt", package = "textclean"),
    stringsAsFactors = FALSE)
x
library(dplyr); library(magrittr)
## There are 2 approaches

## Approach 1: Replace with words
x %>%
    mutate(Tweet = replace_emoji(Tweet)) %$%
    sentiment(Tweet)

## Approach 2: Replace with identifier token
combined_emoji <- update_polarity_table(
    lexicon::hash_sentiment_jockers_rinker,
    x = lexicon::hash_sentiment_emojis
)
x %>%
    mutate(Tweet = replace_emoji_identifier(Tweet)) %$%
    sentiment(Tweet, polarity_dt = combined_emoji)
## End(Not run)

The two approaches give different results. [Image: emoji approaches]

Can anyone explain why the results differ, and under what circumstances each approach is preferable? Thanks!
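One way to narrow this down is to look at the intermediate text each approach produces before it is scored. The sketch below is only a diagnostic, not an answer. It assumes that `sentimentr`, `textclean`, and `lexicon` are installed, and it relies on the documented behavior that `replace_emoji()` substitutes word descriptions for emojis while `replace_emoji_identifier()` substitutes tokens meant to be matched against `lexicon::hash_sentiment_emojis`. The variable names `words`, `tokens`, `s1`, and `s2` are made up for illustration.

```r
library(sentimentr)
library(textclean)

## Same data as in the documentation example
x <- read.delim(system.file("docs/r_tweets.txt", package = "textclean"),
    stringsAsFactors = FALSE)

## Text after each kind of replacement
words  <- replace_emoji(x$Tweet)             # emojis -> word descriptions
tokens <- replace_emoji_identifier(x$Tweet)  # emojis -> identifier tokens

## Compare what the scorer actually sees in each case
head(data.frame(words = words, tokens = tokens, stringsAsFactors = FALSE), 3)

## Emoji-specific polarity entries that only Approach 2 uses
head(lexicon::hash_sentiment_emojis)

## Per-tweet scores side by side
combined_emoji <- update_polarity_table(
    lexicon::hash_sentiment_jockers_rinker,
    x = lexicon::hash_sentiment_emojis
)
s1 <- sentiment_by(get_sentences(words))
s2 <- sentiment_by(get_sentences(tokens), polarity_dt = combined_emoji)
data.frame(words_score = s1$ave_sentiment, token_score = s2$ave_sentiment)
```

If the scores diverge mainly on tweets that contain emojis, that would suggest the difference comes from scoring the description words with the default lexicon versus scoring the whole emoji with its own entry.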

trinker / sentimentr, issue #115