dgrtwo / tidy-text-mining

Manuscript of the book "Tidy Text Mining with R" by Julia Silge and David Robinson
http://tidytextmining.com

Fix relative word frequencies graph in section 2.4 #14

Closed: BarkleyBG closed 7 years ago

BarkleyBG commented 7 years ago

This PR contains two commits that improve the relative word frequencies graph at the end of section 2.4.

The first commit calculates the denominator for relative word frequencies before joining the datasets. Otherwise, inner_join() drops some of each author's words before the totals are computed, which shrinks the denominator and inflates the resulting frequencies (especially important for Austen).
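A minimal sketch of the ordering issue, not the actual code from this PR: the word counts below are invented, and the table names `austen` and `other` are hypothetical stand-ins for the per-author count tables. It assumes dplyr is installed and compares proportions computed after the join with proportions computed before it.

```r
library(dplyr)

# Toy word counts (made-up numbers, for illustration only)
austen <- tibble(word = c("time", "miss", "emma"), n = c(6, 3, 1))
other  <- tibble(word = c("time", "miss", "moor"), n = c(4, 4, 2))

# Join first, then divide: inner_join() keeps only words found in both
# tables, so "emma" and "moor" are missing from the totals.
joined_first <- inner_join(austen, other, by = "word",
                           suffix = c("_austen", "_other")) %>%
  mutate(austen = n_austen / sum(n_austen),  # denominator is 9, not 10
         other  = n_other / sum(n_other))    # denominator is 8, not 10

# Divide first, then join: each denominator is the author's full word total.
austen_prop <- austen %>% mutate(austen = n / sum(n)) %>% select(word, austen)
other_prop  <- other %>% mutate(other = n / sum(n)) %>% select(word, other)
props_first <- inner_join(austen_prop, other_prop, by = "word")

print(joined_first)
print(props_first)
```

In this toy data, "time" comes out as 6/9 for Austen when the join happens first, but 6/10 when the proportion is computed against the full total before joining.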

The second commit is a small improvement that reduces the whitespace in the faceted graph.

juliasilge commented 7 years ago

@BarkleyBG Thank you so much for your suggestion in this PR. We took a very similar approach in our edits to this part of Chapter 2, and now each word's frequency for Austen is the same in each plot. We so appreciate the close and thoughtful reading! :raised_hands: