Some words in Hu&Liu dict have off polarity values

Hello Tyler, all,

First of all, thank you for the amazing work on this package!

I have a question regarding the Hu&Liu (2004) dictionary loaded into the package. A simple frequency check of the polarity values returned the following distribution:

Total observations in table: 6874

| Polarity | -2 | -1.05 | -1 | 0 | 1 |
|---|---|---|---|---|---|
| Words | 7 | 6 | 4824 | 13 | 2024 |

So it appears that 7 entries have a polarity of -2, 6 have a polarity of -1.05, and 13 have a polarity of 0. For example, "i wish" and "unduly" both carry a weight of -2, while "is like" and "i'm like" carry a weight of 0.
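For anyone who wants to reproduce the check, here is a minimal sketch. It assumes the augmented Hu & Liu table is the one exposed as `lexicon::hash_sentiment_huliu`, with the terms in column `x` and the polarity values in column `y`; if your installed version stores the table under a different name, substitute it. The variable names (`huliu`, `polarity_counts`, `unusual`, `multi_word`) are only illustrative.

```r
# Assumption: the dictionary is available as lexicon::hash_sentiment_huliu,
# with columns x (term) and y (polarity). Adjust if your version differs.
library(lexicon)

# Work on a plain data.frame copy so base R indexing behaves as usual
huliu <- as.data.frame(lexicon::hash_sentiment_huliu)

# Frequency of each distinct polarity value
polarity_counts <- table(huliu$y)
print(polarity_counts)

# Entries whose polarity is something other than the plain -1 / +1
unusual <- huliu[!huliu$y %in% c(-1, 1), ]
print(unusual[order(unusual$y), ])

# Entries made up of more than one word (they contain a space)
multi_word <- huliu[grepl(" ", huliu$x, fixed = TRUE), ]
print(multi_word)
```

The last two printouts list the entries behind the unexpected counts, which makes them easy to compare against the original word lists.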

I checked the dictionary available at https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html, and none of these entries seem to appear there. In fact, I did not see any two-word items in that dictionary at all.

These entries do make sense, but I wanted to ask why they are included and whether the weights are accurate and as intended. How are these bigrams accounted for in the calculation? Or am I doing something wrong?

Thanks!

trinker / sentimentr

Some words in Hu&Liu dict have off polarity values #114