cpsievert / LDAvis

R package for web-based interactive topic model visualization.

Blue bars change their width when no topic is selected or when a different topic is selected #105

Open scherbakovdmitri opened 2 years ago

scherbakovdmitri commented 2 years ago

Maybe I don't get how it works, but I am following the example from the text2vec package:

library(text2vec)
data("movie_review")
N = 500
tokens = word_tokenizer(tolower(movie_review$review[1:N]))
it = itoken(tokens, ids = movie_review$id[1:N])
v = create_vocabulary(it)
v = prune_vocabulary(v, term_count_min = 5, doc_proportion_max = 0.2)
dtm = create_dtm(it, vocab_vectorizer(v))
lda_model = LDA$new(n_topics = 10)
doc_topic_distr = lda_model$fit_transform(dtm, n_iter = 20)
# run LDAvis visualisation if needed (make sure LDAvis package installed)
lda_model$plot()

Notice how the bars for the token "end" differ between the two screenshots: in one the bar crosses the tick, and in the other it does not.

[Screenshot 2022-03-22 at 01 41 57] [Screenshot 2022-03-22 at 01 41 53]

This becomes more obvious when the corpus has few tokens: the width then changes considerably. Is there an explanation for this? Thanks!