Topic labeling with Mutual Information

maplejia commented 4 years ago

I read and use your "tidytext" package (https://www.tidytextmining.com/topicmodeling.html) to analyze online review content. I am wondering whether I could ask you a question about how to label a topic with the top-ranked words that have a high mutual information score for that topic. Do you know how to do this in R?

Most topic modeling seems to label a topic using its top words by proportion within that topic, and my results are always hard to explain. Some papers and websites mention labeling a topic with the words that have high mutual information with it. The website below is an example: http://qpleple.com/word-relevance/

I did not find this labeling method in the current LDA package. Do you know where I can find it or how to apply it? I hope this method can help me explain my topics better, since most of the top words I get from LDA are hard to interpret.

juliasilge commented 4 years ago

I have moved most of my IRL topic modeling work over to the stm implementation, which includes some great measures of topic quality like semantic coherence and exclusivity. I wrote about how to measure these in this blog post, and there are links in there to papers which explain these metrics in detail.
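For the mutual-information labeling asked about in the question, here is a minimal base-R sketch. It is not a function from tidytext, topicmodels, or stm, and it may not match the exact formula on the linked page. It assumes you already have a topic-by-word probability matrix p(w | t) and topic prevalences p(t); the names `beta`, `topic_prob`, and `mi_label_words` are hypothetical, and the numbers are toy data. Each word is scored by p(w | t) * log(p(w | t) / p(w)), which is that word's contribution to the mutual information between words and topics, up to the factor p(t).

```r
# Toy topic-word probabilities p(w | t): rows are topics, columns are words.
# In practice this matrix would come from a fitted topic model (assumption).
beta <- matrix(
  c(0.40, 0.30, 0.20, 0.10,
    0.40, 0.10, 0.10, 0.40),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("topic1", "topic2"),
                  c("good", "battery", "screen", "delivery"))
)

# Assumed topic prevalence p(t); here both topics are equally common.
topic_prob <- c(topic1 = 0.5, topic2 = 0.5)

# Hypothetical helper: return the n highest-scoring words for each topic.
mi_label_words <- function(beta, topic_prob, n = 2) {
  # Marginal word probability p(w) = sum over t of p(t) * p(w | t)
  word_prob <- as.vector(topic_prob %*% beta)
  # Pointwise mutual information log(p(w | t) / p(w)), one column per word
  pmi <- log(sweep(beta, 2, word_prob, "/"))
  # Weight by p(w | t) so that very rare words do not dominate the ranking
  score <- beta * pmi
  score[beta == 0] <- 0  # treat 0 * log(0) as 0
  labels <- lapply(seq_len(nrow(score)), function(k) {
    names(sort(score[k, ], decreasing = TRUE))[seq_len(n)]
  })
  setNames(labels, rownames(beta))
}

labels <- mi_label_words(beta, topic_prob, n = 2)
print(labels)
```

In this toy data, "good" has the highest probability in both topics, so ranking by proportion alone would put it first everywhere. Under the score above it contributes nothing for either topic, and the labels become "battery", "screen" for topic1 and "delivery", "good" for topic2.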

juliasilge commented 4 years ago

Let me know if you have further questions! 🙌

dgrtwo / tidy-text-mining

Topic labeling with Mutual Information #78