quanteda / quanteda.textmodels

Text scaling and classification models for quanteda

Add Pang and Lee (2004) movie review polarity dataset #20

Closed kbenoit closed 4 years ago

kbenoit commented 4 years ago

Adds the Pang and Lee (2004) movie review dataset of 2,000 reviews with sentiment labels, useful for classification and sentiment analysis examples. It also works well with textstat_keyness() and topic model examples.
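A minimal sketch of the kind of classification example this dataset supports. It assumes the corpus is exposed as `data_corpus_moviereviews` with a document variable named `sentiment`, and that a recent quanteda (v2 or later) is installed; the 1,500/500 split and the seed are arbitrary choices for illustration, not part of this PR.

```r
library(quanteda)
library(quanteda.textmodels)

# Assumption: the corpus carries a docvar called `sentiment`
corp <- data_corpus_moviereviews

# Tokenize and build a document-feature matrix
dfmat <- dfm(tokens(corp, remove_punct = TRUE))

# Arbitrary train/test split, for illustration only
set.seed(1)
train_idx <- sample(seq_len(ndoc(dfmat)), size = 1500)
dfmat_train <- dfmat[train_idx, ]
dfmat_test  <- dfmat[-train_idx, ]

# Fit a Naive Bayes classifier on the training reviews
tmod <- textmodel_nb(dfmat_train, dfmat_train$sentiment)

# Predict the held-out reviews and cross-tabulate against the labels
pred <- predict(tmod, newdata = dfmat_test)
table(predicted = pred, actual = dfmat_test$sentiment)
```

Both subsets come from the same dfm, so their feature sets already match and no feature alignment step is needed before `predict()`.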

The source data used to create it is also included in tests/data_creation/, which also shows how the README file is stored in the corpus's user metadata. (See the example in ?data_corpus_moviereviews.)

Also reorganizes some of the data documentation.