dgrtwo / tidy-text-mining

Manuscript of the book "Tidy Text Mining with R" by Julia Silge and David Robinson
http://tidytextmining.com

Correct output-text mismatch #8

Closed drsimonj closed 7 years ago

drsimonj commented 7 years ago

Output shows almost perfect classification for Great Expectations, but greatest misassignment for Pride and Prejudice.

juliasilge commented 7 years ago

Actually, the rows show where the words are coming from and the columns show where the words are being assigned to by the topic modeling. So you look across the row to see how well the words are being classified; Pride and Prejudice is classified very well while Great Expectations isn't. We obviously aren't clear enough in our text/code here, though; I'm going to open an issue so we explain this better! Thanks for pointing this out and for your close reading. 👊🏻
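
To make the row-versus-column reading concrete, here is a minimal sketch in Python (not the book's R code). The counts are invented purely for illustration and are not the book's output; the `confusion` table and the `row_accuracy` helper are hypothetical names. Each row is the book a word actually came from, each column is the book the topic model assigned it to, and accuracy is computed across a row.

```python
# Hypothetical counts, invented for illustration only (not the book's results).
# Outer key (row): the book each word actually came from.
# Inner key (column): the book the topic model assigned the word to.
confusion = {
    "Great Expectations": {"Great Expectations": 80, "Pride and Prejudice": 20},
    "Pride and Prejudice": {"Great Expectations": 1, "Pride and Prejudice": 99},
}


def row_accuracy(matrix, source):
    """Share of words from `source` that were assigned back to `source`."""
    row = matrix[source]
    return row[source] / sum(row.values())


# Read across each row: how well were this book's own words classified?
for book in confusion:
    print(f"{book}: {row_accuracy(confusion, book):.2f}")
```

With these made-up numbers, the Pride and Prejudice row scores higher than the Great Expectations row, even though the Pride and Prejudice column collects more misassigned words. Reading down a column instead of across a row is what leads to the opposite conclusion.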

drsimonj commented 7 years ago

Aha, that makes sense. Thanks for explaining!