Closed: utterances-bot closed 2 years ago
Hey. This code and article are a great resource for learning the nitty-gritty of how to write an LDA model.
I have a quick question about re-sampling the topic for a given word:
new_z = np.random.multinomial(1, p_z).argmax()
you take the argmax instead of drawing one topic according to the probabilities, as in, for example:
new_z = np.random.choice(np.arange(10), p=p_z)
Why do you do this? I worry you could be in danger of never escaping the initial random topic assignment.
Thanks!
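For what it's worth, here is a quick sketch (with a made-up `p_z`, not taken from the article) showing what each of the two lines actually does. Drawing one multinomial sample and taking its argmax is itself a random draw from the categorical distribution, equivalent to `np.random.choice`:

```python
import numpy as np

np.random.seed(0)

# Hypothetical topic distribution for one word (illustrative only).
p_z = np.array([0.1, 0.2, 0.7])

# multinomial(1, p_z) draws ONE sample and returns a one-hot count
# vector, e.g. [0, 0, 1]; argmax recovers the index of the topic that
# was randomly drawn -- it is not a deterministic "pick the max of p_z".
new_z = np.random.multinomial(1, p_z).argmax()

# Equivalent draw of a topic index directly:
new_z_alt = np.random.choice(np.arange(len(p_z)), p=p_z)
```

Both lines sample from the same categorical distribution, so neither variant gets stuck on the initial random topic assignment.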
Hey, thanks for the great article. I think there is a typo in your LaTeX formula for the Gibbs sampling: it is missing a '+' in the numerator of the first fraction.
Thanks a lot for the hint! Fixed it now :)
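For readers who want to check the formula themselves, the standard collapsed Gibbs sampling update for LDA looks like this (notation in the article may differ slightly; the `+ \alpha` in the first numerator is the term that is easy to drop):

```latex
p(z_i = k \mid \mathbf{z}_{-i}, \mathbf{w}) \;\propto\;
\frac{n_{d,k}^{-i} + \alpha}{\sum_{k'} \left( n_{d,k'}^{-i} + \alpha \right)}
\cdot
\frac{n_{k,w_i}^{-i} + \beta}{\sum_{v} \left( n_{k,v}^{-i} + \beta \right)}
```

Here $n_{d,k}^{-i}$ counts the words in document $d$ assigned to topic $k$, $n_{k,v}^{-i}$ counts how often vocabulary word $v$ is assigned to topic $k$, and the superscript $-i$ means position $i$ is excluded from the counts.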
Latent Dirichlet allocation from scratch
Today, I’m going to talk about topic models in NLP. Specifically, we will see how the Latent Dirichlet Allocation model works and we will implement it from scratch in numpy. What is a topic model? Assume we are given a large collection of documents.
https://www.depends-on-the-definition.com/lda-from-scratch/