ContinuumIO / topik

A Topic Modeling toolbox
BSD 3-Clause "New" or "Revised" License

Identify desired plots / layout schemes #12

Open msarahan opened 8 years ago

msarahan commented 8 years ago

I'm assuming we'll prefer using Bokeh.

msarahan commented 8 years ago

Cluster visualization (this is LDAvis's strength).

Scaling with the number of documents is critical; LDAvis is limited in this regard.

Termite plot clusters need to be tied to the output of the cluster visualization!

LDA: the similarity quotient is a variable to play with. It highlights significant terms. Nice to have: not heavily used, but cute.

Desired: trending terms over time. This was mocked up in Kibana, but Kibana has no route back to the raw data. We want the ability to go from a cluster to the raw data that contributed to it.
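To make the "route back to raw data" idea concrete, here is a minimal sketch of one way to keep that link: index each term by time bucket and store the contributing document ids rather than only a count. Everything here is hypothetical and not part of topik: the document layout (`id`, `date`, `terms`), the monthly bucketing, and the function names `build_term_timeline` and `trend` are assumptions made for illustration.

```python
from collections import defaultdict


def build_term_timeline(docs):
    """Map term -> month ("YYYY-MM") -> list of contributing document ids.

    Assumes (hypothetically) that each doc is a dict with an "id", an ISO
    "date" string, and a list of already-tokenized "terms".
    """
    timeline = defaultdict(lambda: defaultdict(list))
    for doc in docs:
        month = doc["date"][:7]  # "2015-03-14" -> "2015-03"
        for term in set(doc["terms"]):  # count each doc once per term
            timeline[term][month].append(doc["id"])
    return timeline


def trend(timeline, term):
    """Per-month document counts for a term, in chronological order."""
    buckets = timeline.get(term, {})
    return {month: len(ids) for month, ids in sorted(buckets.items())}


docs = [
    {"id": "a1", "date": "2015-03-14", "terms": ["topic", "model"]},
    {"id": "b2", "date": "2015-03-20", "terms": ["topic", "plot"]},
    {"id": "c3", "date": "2015-04-02", "terms": ["topic"]},
]
timeline = build_term_timeline(docs)
counts = trend(timeline, "topic")          # what a trend plot would draw
sources = timeline["topic"]["2015-03"]     # the raw docs behind one point
```

Because the ids are kept alongside the counts, a click on any point of a trend plot could look up the documents behind it instead of stopping at the aggregate.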

Stats to display:

Topic tagging/naming: user-specified, and tracked through plotting by name instead of by topic index.
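A rough sketch of what name-based tracking could look like: a small registry that owns the index-to-name mapping, so plotting code only ever sees names. The class `TopicNames` and its methods are hypothetical names for illustration, not an existing topik API, and the default `topic_<i>` naming is an assumption.

```python
class TopicNames:
    """Hypothetical registry mapping topic indices to user-chosen names."""

    def __init__(self, n_topics):
        # Default names until the user tags a topic.
        self._names = {i: f"topic_{i}" for i in range(n_topics)}

    def rename(self, index, name):
        if index not in self._names:
            raise KeyError(f"no topic with index {index}")
        # Names must stay unique so plots can be keyed by name.
        taken = {n for i, n in self._names.items() if i != index}
        if name in taken:
            raise ValueError(f"topic name {name!r} is already in use")
        self._names[index] = name

    def label_weights(self, weights):
        """Turn per-topic weights (in index order) into a name-keyed dict."""
        return {self._names[i]: w for i, w in enumerate(weights)}


names = TopicNames(3)
names.rename(1, "sports")
labeled = names.label_weights([0.2, 0.5, 0.3])
```

Downstream plots would then take `labeled` (or similar name-keyed data) as input, so a renamed topic keeps its label across every view without any plot needing to know the underlying index.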

gpfreitas commented 7 years ago

For some reason, I can't close this issue.

@msarahan, can you close it? I'm trying to clean up my "assigned issues".

gpfreitas commented 7 years ago

Maybe we should just disable issues altogether in the repo's "Settings". I did that with another old project repo, and it worked great. (The issues were preserved when I re-enabled them as a test, but I'm not sure GH doesn't erase them after a while. I'd be surprised, but if it's important, we should check.)