trinker / sentimentr

Dictionary based sentiment analysis that considers valence shifters

sentiment of "help" #111

Closed manyu26 closed 5 years ago

manyu26 commented 5 years ago

Why would different forms of "help" yield different sentiment scores?

sentiment("help")
   element_id sentence_id word_count sentiment
1:          1           1          1         0

sentiment("helps")
   element_id sentence_id word_count sentiment
1:          1           1          1       0.8

sentiment("helping")
   element_id sentence_id word_count sentiment
1:          1           1          1       0.5

sentiment("helped")
   element_id sentence_id word_count sentiment
1:          1           1          1       0.8

manyu26 commented 5 years ago

Similarly,

sentiment("friend")
   element_id sentence_id word_count sentiment
1:          1           1          1       0.8

sentiment("friends")
   element_id sentence_id word_count sentiment
1:          1           1          1         0

That said, I can understand how the lexicon might contain the base form of a word (friend) but not its other forms (friends). In the case of "help", however, it is the base form that has no positive score, while the inflected forms (helps, helping, helped) all have one.

trinker commented 5 years ago

This is the result of the default lexicon:

lexicon::hash_sentiment_jockers_rinker[c('help', 'helps', 'helping','helped')]
         x   y
1:    help  NA
2:   helps 0.8
3: helping 0.5
4:  helped 0.8

You can change/update the provided lexicons to match your expectations or devise your own custom dictionary; see ?sentimentr::update_key for examples of how to do this.
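
As a rough illustration of that suggestion, the sketch below adds the two unscored words from this thread to a copy of the default lexicon and passes the result to sentiment() through its polarity_dt argument. The name custom_key and the 0.8 scores are assumptions chosen only to mirror the values already assigned to "helps" and "friend"; pick whatever values suit your data.

```r
library(sentimentr)

# Words that the default lexicon leaves unscored, per the lookups above.
# The 0.8 values are illustrative assumptions, not values from the lexicon.
new_terms <- data.frame(
  x = c("help", "friends"),
  y = c(0.8, 0.8),
  stringsAsFactors = FALSE
)

# update_key() returns a new key; the packaged lexicon itself is unchanged.
custom_key <- update_key(
  lexicon::hash_sentiment_jockers_rinker,
  x = new_terms
)

# Score text against the customized key instead of the default one.
sentiment("help", polarity_dt = custom_key)
sentiment("friends", polarity_dt = custom_key)
```

Because the customized key is a separate object, it has to be passed explicitly on every call; calls that omit polarity_dt continue to use the default lexicon.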