StrangeFate closed this issue 2 years ago
Hello!
You can use the get_predicted_topics method to do this :) You should get a list with the predicted topic for each document.
Thank you @vinid
I just tried what you suggested and I'm a little confused.
The description of get_predicted_topics says it needs a CTM dataset as input. What is the CTM dataset in this context?
Also, I tried passing training_dataset to get_predicted_topics, and I think it worked, since it returns topic numbers. In that case, does the order of the returned topic numbers match the indices of preprocessed_documents (and unpreprocessed_documents too)?
Thanks for the quick reply and good work!
Exactly! You should pass the data you used to train the model (training_dataset) to it. You can then align the result with the original dataset you have :)
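For anyone landing here later, a minimal sketch of that alignment step. It assumes the predicted topics come back as one topic number per document, in the same order as the document list. The hard-coded predicted_topics list and the example documents are made-up stand-ins for the output of get_predicted_topics(training_dataset, ...) and for unpreprocessed_documents, and group_documents_by_topic is a hypothetical helper, not part of the library.

```python
from collections import defaultdict


def group_documents_by_topic(predicted_topics, documents):
    """Group documents by predicted topic number.

    Hypothetical helper: assumes predicted_topics[i] is the topic
    predicted for documents[i], i.e. both lists share the same order.
    """
    if len(predicted_topics) != len(documents):
        raise ValueError(
            "predicted_topics and documents must have the same length; "
            "check that both come from the same preprocessing run"
        )
    grouped = defaultdict(list)
    for topic_id, document in zip(predicted_topics, documents):
        # int() also handles NumPy integer types, so keys are plain ints.
        grouped[int(topic_id)].append(document)
    return dict(grouped)


# Stand-in for the list returned by the model for the training dataset.
predicted_topics = [0, 2, 0, 1]

# Stand-in for unpreprocessed_documents (same order as the training data).
unpreprocessed_documents = [
    "The cat sat on the mat.",
    "Stock markets fell sharply today.",
    "Dogs and cats are popular pets.",
    "The team won the championship game.",
]

docs_by_topic = group_documents_by_topic(predicted_topics, unpreprocessed_documents)

# All documents assigned to topic 0.
for document in docs_by_topic[0]:
    print(document)
```

The length check is there because the alignment only holds when the document list and the training dataset were produced together; if documents were dropped in between, the indices would no longer match.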
Thank you! This works out very well!
Closing issue!
Hi. I have a question about the documents in the topic model.
Basically, I'd like to know whether there is a way to extract the original documents (the preprocessed ones are fine if the unpreprocessed ones can't be extracted) from a trained topic model.
For example, let's say there are thousands of documents and we ran topic modeling on them. The result contains a topic A, and I want to see all the documents assigned to topic A, so that I can understand the documents more deeply than by using only the topic's keywords.
Thanks for reading!