trinker / sentimentr

Dictionary based sentiment analysis that considers valence shifters

consider emotion detection function #93

Closed trinker closed 5 years ago

trinker commented 5 years ago

Use lexicon::nrc_emotions, with negation handling similar to what sentiment uses.
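A minimal sketch of the idea in base R, not the package's implementation. The three-word lexicon, the negator list, the window size, and the name tag_emotions are all assumptions made for illustration; the real lexicon::nrc_emotions is far larger and shaped differently.

```r
# Toy lexicon (hypothetical): word -> emotion
emotion_lexicon <- c(happy = "joy", afraid = "fear", furious = "anger")
# Toy negator list (hypothetical)
negators <- c("not", "never", "no")

# Tag each lexicon hit with its emotion. A hit is prefixed with "negated_"
# when a negator occurs within `window` tokens before it.
tag_emotions <- function(sentence, window = 2) {
    cleaned <- tolower(gsub("[^[:alpha:]' ]", "", sentence))
    tokens <- strsplit(cleaned, "\\s+")[[1]]
    tokens <- tokens[nzchar(tokens)]
    hits <- which(tokens %in% names(emotion_lexicon))
    vapply(hits, function(i) {
        start <- max(1, i - window)
        # Tokens from `start` up to, but not including, the hit itself
        context <- tokens[seq(start, length.out = i - start)]
        emotion <- emotion_lexicon[[tokens[i]]]
        if (any(context %in% negators)) paste0("negated_", emotion) else emotion
    }, character(1))
}

tag_emotions("I am not happy.")
tag_emotions("I am furious")
```

A fixed look-back window is the simplest choice here; the sentiment function considers a wider set of valence shifters.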

trinker commented 5 years ago

Evidence is not conclusive:

library(sentimentr)

dats <- c( 
    "crowdflower_deflategate", 
    "crowdflower_products", 
    "course_evaluations", 
    "crowdflower_self_driving_cars", 
    "crowdflower_weather", 
    "hotel_reviews", 
    "kaggle_movie_reviews", 
    "cannon_reviews", 
    "kotzias_reviews_amazon_cells"
) 

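# Combine every listed data set except cannon_reviews (index 8)
cdat <- combine_data(dats[c(1:7, 9)])

# Split into sentences, then time the profanity scoring
sdat <- get_sentences(cdat)
cl::tic()
swears <- profanity(sdat, profanity_list = c('fucking', 'fuckin'))
cl::toc()

# Keep only the rows with a nonzero profanity rate
library(data.table)
swears[profanity > 0, ]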

Needs more formal counting of how the f word is used.

trinker commented 5 years ago

fuckin is only used 6 times. Still, it should probably be treated the same as the -ing version, though I suspect that in actual use these two words, when spoken, often have different connotations.
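If the two spellings are to be treated the same, one option is to map variants to a canonical form before matching. This is only a sketch; the variants map and the normalize_variants name are hypothetical and not part of sentimentr.

```r
# Hypothetical variant map: names are variant spellings, values are canonical forms
variants <- c(fuckin = "fucking")

# Replace any token found in the map with its canonical spelling
normalize_variants <- function(tokens, map = variants) {
    idx <- tokens %in% names(map)
    tokens[idx] <- unname(map[tokens[idx]])
    tokens
}

normalize_variants(c("so", "fuckin", "cold"))
```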

trinker commented 5 years ago

The main emotion function has been built in the same way profanity is.

To do:

trinker commented 5 years ago

Issue 1

Note that this gives no warning but should:

emotion(text.var, un.as.negation.warn = TRUE)

Issue 2

Also this gives the following error:

emotion('')
 Error in `[.data.table`(emo_dat[, `:=`(negator_loc, ifelse(is_negator,  : 
  Column 1 of j's result for the first group is NULL. We rely on the column types of the first result to decide the type expected for the remaining groups (and require consistency). NULL columns are acceptable for later groups (and those are replaced with NA of appropriate type and recycled) but not for the first. Please use a typed empty vector instead, such as integer() or numeric().
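
The message asks for a typed value rather than NULL in the first group. A sketch of a per-group helper that always returns a length-one integer, assuming the failure comes from a grouped computation that has nothing to return for empty text. The name first_negator is hypothetical and is not taken from the sentimentr source.

```r
# Hypothetical helper: position of the first negator in a group.
# Always returns a single integer (NA_integer_ when there is none),
# so the result type is the same for empty and non-empty groups.
first_negator <- function(is_negator) {
    pos <- which(as.logical(is_negator))
    if (length(pos) == 0L) NA_integer_ else pos[1L]
}

first_negator(c(FALSE, TRUE, TRUE))
first_negator(logical(0))
```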

trinker commented 5 years ago

It doesn't make sense to combine negated anger with fear within Plutchik's wheel, so combine.negated.emotions will not be implemented.

trinker commented 5 years ago

Ensure that 'ununhappy' isn't created by un.as.negation as this slows the searching down (ABOVE)

It turns out this isn't really likely to happen, and it doesn't slow things down much if it does. Finding these cases, plus the conditional logic to deal with them, would take more time. And it's not going to affect accuracy except for the better: in the rare cases where something like ununhappy does exist in the text, we'd want to negate the unhappy anyway.
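
A rough sketch of the un- handling discussed here, in base R. It is not the package's code; expand_un and the toy term list are assumptions. A token starting with "un" is split into "not" plus its root only when the root is a known term and the token itself is not.

```r
# Split "un" + known term into c("not", term); leave everything else alone.
expand_un <- function(tokens, terms) {
    if (length(tokens) == 0L) return(character(0))
    unlist(lapply(tokens, function(tok) {
        root <- sub("^un", "", tok)
        if (startsWith(tok, "un") && !(tok %in% terms) && root %in% terms) {
            c("not", root)
        } else {
            tok
        }
    }), use.names = FALSE)
}

terms <- c("happy", "afraid", "unhappy")  # toy term list (hypothetical)
expand_un(c("i", "am", "unafraid"), terms)
expand_un("ununhappy", terms)
```

Under these assumptions a stray ununhappy simply becomes a negated unhappy, which matches the reasoning above that no special case is needed.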