mbahacker closed 8 years ago
doc_cnt and doc_ids have the same structure as in the online LDA code by Matthew D. Hoffman (they correspond to wordcts and wordids in his code): https://github.com/blei-lab/onlineldavb/blob/master/onlineldavb.py n_voca is the size of the vocabulary. n_topics is the number of topics, as generally used in topic models; it is a parameter you choose, not the number of documents.
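To make the structure concrete, here is a minimal sketch of how the two inputs could be built from raw text, assuming they follow the wordids/wordcts layout of Hoffman's code: one inner list per document, where doc_ids[d] holds the vocabulary indices of the distinct words in document d and doc_cnt[d] holds the matching counts. The helper name build_doc_ids_cnt, the toy vocabulary, and the whitespace tokenization are all made up for illustration and are not part of either code base.

```python
from collections import Counter


def build_doc_ids_cnt(docs, vocab):
    """Hypothetical helper: turn raw documents into (doc_ids, doc_cnt).

    doc_ids[d] -> vocabulary indices of the distinct words in document d
    doc_cnt[d] -> how many times each of those words occurs in document d
    """
    word_to_id = {word: idx for idx, word in enumerate(vocab)}
    doc_ids, doc_cnt = [], []
    for doc in docs:
        # Count only the words that are in the vocabulary.
        counts = Counter(
            word_to_id[w] for w in doc.lower().split() if w in word_to_id
        )
        ids = sorted(counts)
        doc_ids.append(ids)
        doc_cnt.append([counts[i] for i in ids])
    return doc_ids, doc_cnt


# Toy data (assumed for illustration only).
vocab = ["apple", "banana", "cherry", "date"]
docs = ["apple banana apple", "cherry date date date banana"]

doc_ids, doc_cnt = build_doc_ids_cnt(docs, vocab)

n_voca = len(vocab)  # size of the vocabulary
n_topics = 2         # chosen by the user; unrelated to len(docs)

print(doc_ids)  # [[0, 1], [1, 2, 3]]
print(doc_cnt)  # [[2, 1], [1, 1, 3]]
```

So doc_ids is not a flat 1-D array of document ids: it is a list with one entry per document, and each entry lists word ids. With 1000 documents, doc_ids and doc_cnt would each have 1000 inner lists, while n_topics would still be whatever number of topics you pick.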
doc_cnt: what is it, and what is its structure? doc_ids: is it a 1-D array of all the doc ids? n_voca: what is it?
n_topics: I believe the items are the actual documents. So, if I have 1000 documents, will n_topics be equal to 1000?