Made the Python example notebook Python 3 compatible and made the examples a little more interesting. Also set random seeds, so the results should be directly reproducible.
Added functionality so you can add labels to the doc-term matrix columns (terms) more easily after having trained a CorEx model.
Added a new attribute to set labels for the rows (docs), including the ability to set the doc labels after training the CorEx topic model.
Added a "get_top_docs" function which returns documents sorted according to probability or TC. I put a warning under TC because we're still trying to figure out the right way to think about it.
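For anyone reviewing the sort order, here is a minimal standalone sketch of what sorting by probability could look like. It is not the code in this PR: the function name `top_docs_by_probability`, its parameters, and the assumption that the model exposes a (docs x topics) probability array are all hypothetical and only there for illustration.

```python
import numpy as np


def top_docs_by_probability(p_y_given_x, topic, n_docs=10, doc_labels=None):
    """Return (label, probability) pairs for the most probable docs of one topic.

    Hypothetical helper: `p_y_given_x` is assumed to be an array of shape
    (n_docs_total, n_topics) holding each document's probability per topic.
    """
    probs = np.asarray(p_y_given_x, dtype=float)[:, topic]
    # argsort is ascending, so sort the negated values to get descending order;
    # the stable sort keeps tied documents in their original order.
    order = np.argsort(-probs, kind="stable")[:n_docs]
    if doc_labels is None:
        # Fall back to row indices when no document labels were set.
        doc_labels = list(range(len(probs)))
    return [(doc_labels[i], float(probs[i])) for i in order]


# Toy example: three documents, two topics.
example_probs = np.array([[0.9, 0.1],
                          [0.2, 0.8],
                          [0.5, 0.5]])
example_top = top_docs_by_probability(
    example_probs, topic=0, n_docs=2, doc_labels=["a", "b", "c"]
)
```

The point to check is the direction of the sort: the highest-probability documents should come first, and the labels should stay aligned with their rows after sorting.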
If someone could check to make sure I'm sorting documents correctly, that'd be great. I think everything else should be in order.