dbl001 opened 5 years ago
I have experimented with adjustments to the `lda_loss` function in Lda2vec.py, e.g.:

```python
# Add a penalty on pairwise topic similarity to the original prior term
normalized = tf.nn.l2_normalize(self.mixture.topic_embedding, axis=1)
loss_lda = self.lmbda * fraction * self.prior() + \
           self.learning_rate * tf.reduce_sum(
               tf.matmul(normalized, normalized, adjoint_b=True, name="topic_matrix"))
```
This change to the `lda_loss` objective reduces the correlation between topics in the `topic_embedding` matrix.
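To check the effect, the pairwise topic similarities can be inspected directly. A minimal sketch, assuming the learned `topic_embedding` matrix has been fetched into a NumPy array (the helper name here is illustrative, not part of the lda2vec codebase):

```python
import numpy as np

def topic_similarity_matrix(topic_embedding: np.ndarray) -> np.ndarray:
    """Cosine similarity between every pair of topic vectors.

    topic_embedding: (n_topics, embedding_dim) array, e.g. pulled out
    of the model with session.run() after training.
    """
    norms = np.linalg.norm(topic_embedding, axis=1, keepdims=True)
    normalized = topic_embedding / np.maximum(norms, 1e-12)  # guard against zero vectors
    return normalized @ normalized.T  # (n_topics, n_topics), diagonal == 1.0

# Off-diagonal entries near 0 indicate decorrelated topics.
```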
Also, this NIPS paper ("Reading Tea Leaves: How Humans Interpret Topic Models", Chang et al., 2009) discusses a methodology for quantifying topic-model quality, specifically via two human-evaluation tasks: word intrusion and topic intrusion.
http://users.umiacs.umd.edu/~jbg/docs/nips2009-rtl.pdf
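For anyone trying this, here is a minimal sketch of how a single word-intrusion instance from that paper can be assembled (the helper and the sample words below are hypothetical, not from the paper's released code):

```python
import random

def make_word_intrusion_task(top_words, intruder_pool, n_shown=5, seed=0):
    """Build one word-intrusion instance.

    top_words: the n_shown highest-probability words for one topic.
    intruder_pool: words that are low-probability for this topic but
    high-probability for some other topic (per Chang et al., 2009).
    Returns the shuffled word list and the index of the intruder.
    """
    rng = random.Random(seed)
    intruder = rng.choice(intruder_pool)
    words = list(top_words[:n_shown]) + [intruder]
    rng.shuffle(words)
    return words, words.index(intruder)

# A human judge then tries to spot the intruder; the fraction of correct
# identifications across judges measures topic coherence.
words, answer_idx = make_word_intrusion_task(
    ["neural", "network", "layer", "gradient", "training"],
    ["banana", "senate"])
print(words, "intruder at index", answer_idx)
```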
Please experiment and let me know what you find.
Topic Similarity Matrix after 33 Epochs: