luoshao23 closed this issue 6 years ago.
(This is lines 215-217 of lda.py, right?)
This implements what you see in Equation 4 in Buntine 2009, translated into numpy. Looking at the first two lines of code you included (the last line is just a normalizer):
`self.components_[:, words]`
gets you, in Buntine's notation, the vector $\theta_{:,\,\mathrm{words}[0]}, \theta_{:,\,\mathrm{words}[1]}, \ldots$

Looks like I'm missing the citation to the relevant part of this paper: [WMSM09] H. M. Wallach, I. Murray, R. Salakhutdinov, and D. Mimno. Evaluation methods for topic models. In L. Bottou and M. Littman, editors, Proceedings of the 26th International Conference on Machine Learning (ICML 2009), 2009.
It's section 4.1 from that paper, Equation 11. (I'm going to create a pull request to update the docstring to include this citation.)
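For reference, here is my reading of that proposal distribution (Equation 11 in [WMSM09]; Buntine's Equation 4 is the analogous update), with $q$ playing the role of PZS and $\theta$ the role of self.components_:

$$
q^{(i)}(z_n = k) \;\propto\; \theta_{k,\,w_n}\Big(\alpha_k + \sum_{n' \neq n} q^{(i-1)}(z_{n'} = k)\Big),
\qquad
\sum_{k} q^{(i)}(z_n = k) = 1.
$$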
We're using this proposal distribution directly as an approximation of P(z|w) for the new document. This method was chosen because it's very fast and simple.
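To make the whole update concrete, here is a minimal self-contained numpy sketch of the iteration as I understand it. This is a reconstruction for illustration, not the exact code in lda.py; the function name, the `max_iter`/`tol` defaults, and the final averaging step are my own choices.

```python
import numpy as np

def transform_single_sketch(components, words, alpha, max_iter=20, tol=1e-16):
    """Sketch: approximate P(z|w) for one unseen document.

    components : (n_topics, n_vocab) topic-word matrix, held fixed
    words      : 1-D array of token ids for the document
    alpha      : symmetric doc-topic Dirichlet prior (scalar)
    """
    n_topics = components.shape[0]
    # PZS[n, k] ~ q(z_n = k); starting from zeros means the first pass
    # reduces to normalizing components[:, words].T * alpha
    PZS = np.zeros((len(words), n_topics))
    for _ in range(max_iter):
        # theta_{k, w_n} * (alpha + sum over the *other* tokens of q(z = k))
        PZS_new = components[:, words].T * (PZS.sum(axis=0) - PZS + alpha)
        PZS_new /= PZS_new.sum(axis=1)[:, np.newaxis]  # normalize over topics
        delta = np.abs(PZS_new - PZS).sum()
        PZS = PZS_new
        if delta < tol:
            break
    # collapse per-token responsibilities into one doc-topic distribution
    return PZS.sum(axis=0) / PZS.sum()
```

The point most relevant to the question below: `components` is read but never written inside the loop; only the per-token pseudo-counts in `PZS` are re-estimated on each pass.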
I hope this helps. Thank you for reviewing this code! It's great to have a second pair of eyes on it.
Thank you for your explanation; that is what I was looking for. I am happy to do the code review. It makes me more aware of how the algorithm works, and I hope it helps other people understand the package better when they use it.
I am confused about the update process in the function `_transform_single`. In my opinion, PZS_new should be updated using the following formula: [formula image missing from the original post]. In this code, it seems the doc-topic matrix is updated via

`(PZS.sum(axis=0) - PZS + self.alpha)`

while the word-topic matrix `self.components_[:, words].T` remains unchanged. Can you explain the mechanism behind the code or give some reference? I have read the paper you mentioned in the README, but I still have difficulty understanding your code, especially the following lines: