Closed eijoac closed 7 years ago
Sentiment analysis by word
"Error in summarize(., occurences = n(), contribution = sum(score)) : argument "by" is missing, with no default"
@xkuang This isn't related to this issue, but the problem is that you have another package loaded after dplyr (most likely Hmisc) whose summarize function masks dplyr's. If you type summarize at the console, you'll see which package's version is being called. If you restart R and make sure to load Hmisc before dplyr, this will be fixed. See here for more!
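As a rough sketch of another way around the masking, the call can be namespace-qualified so that load order no longer matters. The `word_scores` data frame below is a made-up stand-in for the real per-word sentiment scores, not data from the book.

```r
library(dplyr)

# Hypothetical toy data standing in for per-word sentiment scores
word_scores <- data.frame(
  word  = c("good", "good", "bad"),
  score = c(3, 3, -3),
  stringsAsFactors = FALSE
)

# Report which package the unqualified name currently resolves to
# ("dplyr" here; it would be "Hmisc" if Hmisc were attached after dplyr)
summarize_source <- environmentName(environment(summarize))

# Qualifying the call with dplyr:: sidesteps the mask regardless of load order
contributions <- word_scores %>%
  group_by(word) %>%
  dplyr::summarize(occurences = dplyr::n(), contribution = sum(score))
```

Writing `dplyr::summarize` everywhere is more verbose than fixing the load order, but it keeps a script working even if someone attaches Hmisc later.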
I'm going to close this issue, because although stemming is an important NLP task, there are other packages that implement it in R and we don't focus on it in this book. We do plan to add examples with stemming in a vignette for tidytext eventually.
For some of the analyses in the book, it's better to stem the words first. For example, in the analysis of inauguration speeches in chapter 6, it makes more sense to group together words like job/jobs, union/unions, constitution/constitutions, etc. before calculating tf-idf and plotting the frequency time series.
I understand that stemming is not integrated in the tidytext package for a good reason (https://github.com/juliasilge/tidytext/issues/17). Perhaps that's why you try to avoid stemming in the book?
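For illustration, here is a minimal sketch of what stemming before counting could look like, assuming the SnowballC package is used for the stemmer (one of several R packages that implement stemming, and not something the book prescribes). The `speech_words` data frame is a hypothetical one-word-per-row table, similar in shape to tokenized output.

```r
library(dplyr)
library(SnowballC)

# Hypothetical one-word-per-row data, similar in shape to tokenized text
speech_words <- data.frame(
  word = c("job", "jobs", "union", "unions"),
  stringsAsFactors = FALSE
)

# Map each word to its stem, then count on the stem instead of the raw word
stem_counts <- speech_words %>%
  mutate(stem = wordStem(word, language = "english")) %>%
  count(stem, sort = TRUE)
```

Counts computed on `stem` can then feed into a tf-idf calculation or a frequency plot in place of counts on `word`. Stems are not always readable words, so it may be worth keeping the original `word` column around for labeling.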