dselivanov / text2vec

Fast vectorization, topic modeling, distances and GloVe word embeddings in R.
http://text2vec.org

Topic Numbers/Labels in LDAvis #266

Closed: fishdontfly closed this issue 6 years ago

fishdontfly commented 6 years ago

Thank you for creating such a useful package!

I recently noticed that the topic numbers that result from lda_model$plot() differ from those that lda_model$get_top_words() produces.

For example, if I run lda_model$plot(n = 5, lambda = 0.6), the bubble for topic #1 actually matches topic #15 from lda_model$get_top_words(). I checked, and the plot is not reordering topics, so I cannot figure out why this is happening.

Ideally I'd like to produce a list of the top terms for each topic using lda_model$get_top_words(), and also use the LDAvis plot from lda_model$plot() to explore them even further, but this is difficult to do without the topic numbers matching.

dselivanov commented 6 years ago

Hi @fishdontfly. This happens because the CRAN version of LDAvis dates from 2015. Please install the latest version from GitHub (devtools::install_github("cpsievert/LDAvis")); after that it will work as you expect.

This is related to #233: once that is rewritten in text2vec, we won't depend on an unreleased version of LDAvis.
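For anyone following along, here is a rough sketch of the workflow discussed in this thread: fit an LDA model, list the top words per topic, then open the LDAvis plot to compare topic numbers. It is only an illustration under several assumptions that are not from the thread: the bundled movie_review data set, the 500-document subset, the pruning thresholds, the number of topics, and the prior values are all placeholders. The install line is left commented out so the script does not touch your library unless you opt in.

```r
library(text2vec)

# Optional: install the development version of LDAvis, as suggested above.
# devtools::install_github("cpsievert/LDAvis")

# Report which LDAvis version is installed, if any.
if (requireNamespace("LDAvis", quietly = TRUE)) {
  print(packageVersion("LDAvis"))
}

# Illustrative corpus (assumption): a subset of the movie_review data
# that ships with text2vec.
data("movie_review")
docs <- movie_review$review[1:500]
tokens <- word_tokenizer(tolower(docs))
it <- itoken(tokens, ids = movie_review$id[1:500], progressbar = FALSE)

# Build and prune the vocabulary, then create the document-term matrix.
# The thresholds are arbitrary choices for this sketch.
vocab <- create_vocabulary(it)
vocab <- prune_vocabulary(vocab, term_count_min = 10, doc_proportion_max = 0.2)
dtm <- create_dtm(it, vocab_vectorizer(vocab))

# Fit the topic model (10 topics and these priors are placeholder values).
lda_model <- LDA$new(n_topics = 10, doc_topic_prior = 0.1, topic_word_prior = 0.01)
doc_topic_distr <- lda_model$fit_transform(dtm, n_iter = 200, progressbar = FALSE)

# Top words per topic: one column per topic, one row per rank.
top_words <- lda_model$get_top_words(n = 5, lambda = 0.6)
print(top_words)

# Open the interactive plot only in an interactive session with LDAvis
# available, then compare the topic numbers against the columns above.
if (interactive() && requireNamespace("LDAvis", quietly = TRUE)) {
  lda_model$plot()
}
```

Column k of top_words corresponds to topic k in the fitted model, so that is the numbering to check against the bubbles in the plot.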

fishdontfly commented 6 years ago

Thank you @dselivanov, that did the trick and I appreciate your fast response! Fixes #266