Closed: leungi closed this issue 5 years ago
Hi. First, sorry it took me so long to get to this. Second, I don't know that there's anything I can do for you here, at least not without a reproducible example. If you can get me one, I'd happily take a look and see if it's an issue with textmineR, LDAvis, or something else. Maybe use the nih_sample_topic_model that ships with textmineR?
Anyway, closing for now. Feel free to re-open if you have that reproducible example I can work off of.
Sorry, I was being dumb. You did give me an example using the nih_sample data.
I'm not sure how LDAvis sorts terms. Maybe take this issue up there?
Glad you're able to respond @TommyJones; thanks!
I'll try to cross-post this with LDAvis.
So after playing with this for a couple of minutes, it looks like their index is off. In the example I have, LDAvis topic 1 is referencing t_12, and LDAvis topic 2 references t_19. I haven't checked the others, but that's odd behavior.
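One possible explanation, not confirmed in this thread: LDAvis's createJSON documents a reorder.topics argument that defaults to TRUE and re-orders topics by decreasing proportion, so the topic numbers in the plot need not match the model's t_k labels. The sketch below is a language-agnostic illustration in Python of that kind of reordering, not LDAvis's actual code. The function name topic_order and all numbers are made up for the example.

```python
# Sketch of topic reordering by decreasing corpus-wide proportion.
# theta: rows = documents, columns = topics, i.e. P(topic | document).
# The data and the helper name are hypothetical, for illustration only.

def topic_order(theta, doc_lengths):
    """Return 0-based topic indices sorted by decreasing token-weighted proportion."""
    n_topics = len(theta[0])
    # Expected number of tokens assigned to each topic across the corpus.
    freq = [
        sum(row[k] * n for row, n in zip(theta, doc_lengths))
        for k in range(n_topics)
    ]
    total = sum(freq)
    proportion = [f / total for f in freq]
    # sorted() is stable, so tied topics keep their original relative order.
    return sorted(range(n_topics), key=lambda k: -proportion[k])


theta = [
    [0.1, 0.2, 0.7],
    [0.2, 0.3, 0.5],
    [0.1, 0.6, 0.3],
]
doc_lengths = [100, 50, 50]

order = topic_order(theta, doc_lengths)
# Map each displayed (1-based) topic number back to the model's own label.
display_to_model = {shown + 1: f"t_{k + 1}" for shown, k in enumerate(order)}
print(display_to_model)
```

Here the model's t_3 carries the most tokens, so it would be displayed as topic 1. If this is what is happening, passing reorder.topics = FALSE to LDAvis::createJSON should keep the model's own ordering, which makes it easier to compare the plot against model$summary.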
That seems to be the case from my memory.
Thanks for investigating!
Apologies for the non-reprex (due to size), but below is code using an example from the textmineR package, so it should be reproducible.
Issue: reviewing model$summary for, say, topic 1 (t_1), it seems that it doesn't match the t_1 marked in the LDAvis plot.
I believe the definitions of phi, P(token|topic), and theta, P(topic|document), are the same across textmineR and LDAvis, so I'd expect similar topic/word clusters.