Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality
[bibtex](@inproceedings{lau2014machine,
title={Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality},
author={Lau, Jey Han and Newman, David and Baldwin, Timothy},
booktitle={EACL},
pages={530--539},
year={2014}
})
General:
A good paper that gives a rationale for topic instability
Measures:
notion of topic “coherence”, with an automatic method for estimating it based on pairwise pointwise mutual information (PMI) between the topic words
direct approach: asking people about topics; indirect approach: evaluating PMI, CP
To create gold-standard coherence judgements, they used Amazon Mechanical Turk
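A minimal sketch of the pairwise PMI idea noted above, to make the measure concrete. It is not the paper's implementation: the function name `pmi_coherence`, the `eps` smoothing constant, and the use of whole-document co-occurrence counts (rather than whatever reference corpus and windowing the authors used) are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pmi_coherence(top_words, documents, eps=1e-12):
    """Mean pairwise PMI over a topic's top words.

    Assumption: probabilities are estimated from document-level
    co-occurrence in `documents` (a list of token lists).
    `eps` is a hypothetical smoothing term that avoids log(0).
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents containing every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        scores.append(math.log((p12 + eps) / (p1 * p2 + eps)))
    return sum(scores) / len(scores)


# Toy corpus: two "tea" documents and two "finance" documents.
docs = [
    ["tea", "leaf", "cup"],
    ["tea", "leaf"],
    ["stock", "market"],
    ["stock", "market", "bank"],
]
coherent = pmi_coherence(["tea", "leaf"], docs)
mixed = pmi_coherence(["tea", "stock"], docs)
```

In this toy corpus the words that always co-occur score higher than the pair that never does, which is the behaviour the coherence measure relies on.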
Problems:
perplexity correlates negatively with topic interpretability
Research Question:
word intrusion measures topic interpretability differently to observed coherence
Terminologies:
topic coherence: the semantic interpretability of the top terms usually used to describe discovered topics
“intruder word”: a word which has low probability in the topic of interest, but high probability in other topics
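A rough sketch of how an intruder word could be chosen from that definition. It is a deterministic simplification, not the procedure from the paper: the function name `pick_intruder`, the `n_top` cutoff, and the dict-of-dicts topic representation are hypothetical, and a real word intrusion setup would typically sample the intruder at random.

```python
def pick_intruder(topics, target, n_top=5):
    """Pick a word that is prominent in another topic but unlikely in `target`.

    Assumption: `topics` maps topic id -> {word: probability}.
    """
    def top_words(tid):
        dist = topics[tid]
        return sorted(dist, key=dist.get, reverse=True)[:n_top]

    target_top = set(top_words(target))
    # Candidates: top words of the other topics, excluding the target's own top words.
    candidates = {
        w for tid in topics if tid != target for w in top_words(tid)
    } - target_top
    if not candidates:
        return None
    # Lowest probability under the target topic; sorting first makes ties deterministic.
    return min(sorted(candidates), key=lambda w: topics[target].get(w, 0.0))


topics = {
    "tea": {"tea": 0.4, "leaf": 0.3, "cup": 0.2, "bank": 0.06, "stock": 0.04},
    "finance": {"stock": 0.5, "bank": 0.3, "market": 0.15, "tea": 0.05},
}
intruder = pick_intruder(topics, "tea", n_top=3)
```

The intruder would then be shuffled in with the target topic's top words and shown to annotators, who try to spot it.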