bmabey / pyLDAvis

Python library for interactive topic model visualization. Port of the R LDAvis package.
BSD 3-Clause "New" or "Revised" License

Visualizing Dynamic Topic Models #66

Open bhargavvader opened 8 years ago

bhargavvader commented 8 years ago

Are there any ideas so far for visualizing Dynamic Topic Models? They are an interesting time-based extension of LDA and would be pretty cool to visualize. This report sort of plays around with visualizing them, but I'm pretty sure a much more comprehensive job can be done.

bhargavvader commented 8 years ago

Blei's paper on Dynamic Topic Models also has some nice visualizations, and he also illustrates them well in his Google Tech Talk.

I am working on implementing Dynamic Topic Models for gensim through the Google Summer of Code program. Once it is completed, I could try to implement a class for pyLDAvis to visualize them, if no one else already has plans to do so.

bmabey commented 7 years ago

Sorry, I never responded to this. This is something I've actually spent some time on. In the past, I've created an HDP model and used a sliding-window approach across the entire corpus to incrementally adjust the model at different times. This paper does something similar:

Tracking and Connecting Topics via Incremental Hierarchical Dirichlet Processes

This paper also has a neat river visualization, which I think is more effective than trying to animate something like LDAvis over time (in general, I think animation should be a last resort in visualization).

This older paper on "topic chains" does something similar, but I don't like the approach since it requires refitting a model for each step of the window. You then have to line up topics with those from the previous window, which isn't too bad, but you miss out on some of the other cool things you can do, as outlined in the paper above, when your model is being updated over time (e.g. you can better highlight what shifted over time).

http://uilab.kaist.ac.kr/research/topic-chain/
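
For anyone who wants to play with the window-and-align idea, here is a rough sketch of the two mechanical pieces: slicing a time-sorted corpus into windows, and lining up topics between consecutive windows. It is not the method from either paper. The function names, the choice of cosine similarity, the greedy one-to-one matching, and the 0.5 threshold are all assumptions made for illustration, and topics are taken to be word-probability vectors over a shared vocabulary.

```python
from math import sqrt


def sliding_windows(docs, size, step):
    """Yield (start_index, window) pairs over docs that are already sorted by time."""
    last_start = max(len(docs) - size, 0)
    for start in range(0, last_start + 1, step):
        yield start, docs[start:start + size]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def align_topics(prev_topics, curr_topics, threshold=0.5):
    """Greedily match topics from the previous window to the current one.

    Each topic is a list of word probabilities over the same vocabulary.
    Returns (prev_index, curr_index, similarity) triples sorted by prev_index.
    Topics with no match above the threshold are treated as having ended
    (previous window) or newly appeared (current window).
    """
    candidates = sorted(
        (
            (cosine(p, c), i, j)
            for i, p in enumerate(prev_topics)
            for j, c in enumerate(curr_topics)
        ),
        reverse=True,  # most similar pairs first
    )
    used_prev, used_curr, links = set(), set(), []
    for sim, i, j in candidates:
        if sim < threshold:
            break  # everything after this is even less similar
        if i in used_prev or j in used_curr:
            continue  # keep the matching one-to-one
        used_prev.add(i)
        used_curr.add(j)
        links.append((i, j, sim))
    return sorted(links)
```

In the refit-per-window setup you would fit a model on each window from `sliding_windows` and call `align_topics` on the topic-word matrices of neighbouring windows to build the chains. With an incrementally updated model the topic identities carry over, so the alignment step is not needed.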

bhargavvader commented 7 years ago

This looks very interesting, thank you for the links. I'll try and get around to this when I get some time on my hands.

Also, the Python implementation of DTM in gensim is complete, and I've included a tutorial where I discuss using pyLDAvis to visualise DTM (one time-slice at a time, which is fairly trivial).
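
A minimal sketch of the one-slice-at-a-time approach, assuming the model exposes a `dtm_vis(time, corpus)` method that returns `doc_topic, topic_term, doc_lengths, term_frequency, vocab` (which is how I recall gensim's `LdaSeqModel` working; check it against your installed version). The helper name `prepare_slices` and its `prepare_fn` parameter are made up for this example. `pyLDAvis.prepare` is only imported when no `prepare_fn` is passed in.

```python
def prepare_slices(model, corpus, num_slices, prepare_fn=None):
    """Build one pyLDAvis data object per time slice.

    model      -- assumed to provide dtm_vis(time=..., corpus=...)
    corpus     -- the bag-of-words corpus the model was trained on
    num_slices -- number of time slices in the model
    prepare_fn -- hypothetical hook; defaults to pyLDAvis.prepare
    """
    if prepare_fn is None:
        import pyLDAvis  # lazy import so the helper itself has no hard dependency

        prepare_fn = pyLDAvis.prepare

    prepared = []
    for t in range(num_slices):
        # Pull the five inputs pyLDAvis needs for time slice t.
        doc_topic, topic_term, doc_lengths, term_frequency, vocab = model.dtm_vis(
            time=t, corpus=corpus
        )
        prepared.append(
            prepare_fn(
                topic_term_dists=topic_term,
                doc_topic_dists=doc_topic,
                doc_lengths=doc_lengths,
                vocab=vocab,
                term_frequency=term_frequency,
            )
        )
    return prepared
```

Each element of the returned list can then be shown with `pyLDAvis.display(...)` in a notebook or written out with `pyLDAvis.save_html(...)`, giving one independent view per slice. Nothing here links topics across slices visually, which is the part this issue is really about.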

nipunsadvilkar commented 7 years ago

I would like to start on the visualisation piece. @bhargavvader: I have gone through your IPython notebook, where you explain it very well, and I see that comparing the two models (wrapper & ldaseq) is difficult there. So, to start, I want to build a single pyLDAvis visualisation for two models (extending to N models in future), where each model is assigned a different colour scheme (the existing blue-red for one model and another combination for the second).

@bmabey Any input on how I should approach this and what the workflow should be?

[P.S. - I will be doing topic modelling using gensim]

bmabey commented 7 years ago

Hi @nipunsadvilkar, I'm not clear on what you are proposing so I can't say if it would make sense to merge something like that in. So, I would say create a new repo (or fork pyLDAvis) and add a notebook showing what you come up with. We can decide then if it makes sense to merge it into the library.

koljamaier commented 7 years ago

@bhargavvader Hi! Check out this nice tutorial. I adapted the visualization to my dJST project. You can check out the source code here.

Cheers