dgrtwo / tidy-text-mining

Manuscript of the book "Tidy Text Mining with R" by Julia Silge and David Robinson
http://tidytextmining.com

Ch. 6 reorder confusion in top terms plots #45

Closed damonbayer closed 5 years ago

damonbayer commented 6 years ago

I spent some time today confused about the top terms plots in chapter 6 - Topic modeling (Fig 6.2 and Fig 6.4).

I assumed that mutate(term = reorder(term, beta)) would order the bars within each topic by descending height, so that the viewer could quickly see the most important words for each topic, but it does not: "new" in Fig 6.2 and "pip" in Fig 6.4 are in the "wrong" place.

I think this might be the intent of the code, although it appears @dgrtwo is aware of this behavior and has previously devised a solution for ordering factors this way. If this is not the intent, it is unclear to me why one would reorder the terms this way.

Using reorder_within would likely be too complex a solution for this minor issue, but maybe it would make sense to remove this line from the code? It either claims to order the terms in a way that isn't reflected in the plot, or, if you are already familiar with how ggplot2 handles factor order in faceted plots, it reorders the terms for no obvious reason. With the line removed, the terms would be ordered alphabetically, which makes more sense than the way they are presented now.
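
For anyone reading along, the behavior described above can be reproduced without the book's data. The sketch below is only an illustration: the terms and beta values are made up, and it uses base R's reorder() on a plain data frame rather than the chapter's pipeline. It shows that reorder() builds a single set of factor levels for the whole column (by default ordered by the mean of beta per term), so a term that appears in more than one topic ends up in one global position that matches neither facet.

```r
# Toy data (made-up values): two topics that share the term "new"
top_terms <- data.frame(
  topic = c(1L, 1L, 1L, 2L, 2L, 2L),
  term  = c("percent", "new", "million", "president", "government", "new"),
  beta  = c(0.010, 0.009, 0.007, 0.006, 0.005, 0.002),
  stringsAsFactors = FALSE
)

# reorder() creates ONE ordering for the whole column: each term is
# ranked by the mean of its beta values across all topics
top_terms$term <- reorder(top_terms$term, top_terms$beta)

# "new" is ranked by mean(c(0.009, 0.002)) = 0.0055, so it sits between
# "government" and "president" in every facet
global_levels <- levels(top_terms$term)
print(global_levels)

# Within topic 1, "new" has a larger beta than "million" but a lower
# factor level, so its bar is drawn out of order
topic1 <- top_terms[top_terms$topic == 1L, ]
print(topic1[order(topic1$term), c("term", "beta")])
```

Dropping the shared term from either topic makes the mismatch disappear, which is why only some bars in a faceted plot look misplaced.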

juliasilge commented 5 years ago

The PR in tidytext, juliasilge/tidytext#110, has now been merged, so I'll make some updates to the book once the new version of tidytext is on CRAN.
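
A minimal sketch of what per-facet ordering could look like, assuming the installed tidytext version exports reorder_within() and scale_x_reordered() (the helper named earlier in this thread). The data are made-up toy values, not the book's model output, and the plotting code is an assumption about how the figures might be updated rather than the book's final code.

```r
library(dplyr)
library(ggplot2)
library(tidytext)  # assumed: a version providing reorder_within() and scale_x_reordered()

# Toy data (made-up values): two topics that share the term "new"
top_terms <- tibble(
  topic = c(1L, 1L, 1L, 2L, 2L, 2L),
  term  = c("percent", "new", "million", "president", "government", "new"),
  beta  = c(0.010, 0.009, 0.007, 0.006, 0.005, 0.002)
)

# reorder_within() appends the topic to each term (e.g. "new___1"),
# so every term/topic pair gets its own position in the factor levels
ordered_terms <- top_terms %>%
  mutate(term = reorder_within(term, beta, topic))

# scale_x_reordered() strips the "___<topic>" suffix from the axis labels;
# scales = "free" lets each facet show only its own terms
p <- ggplot(ordered_terms, aes(term, beta, fill = factor(topic))) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~ topic, scales = "free") +
  coord_flip() +
  scale_x_reordered()

print(p)
```

With this approach the shared term "new" is ordered separately in each panel, so the bars in every facet are sorted by height.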