lfmatosm / embedded-topic-model

A package to run embedded topic modelling with ETM. Adapted from the original at: https://github.com/adjidieng/ETM
MIT License

Word Vectors and Distribution #7

Closed: Sara-Online closed this issue 1 year ago

Sara-Online commented 3 years ago

Hi! I used this package successfully, but how can I get the word vectors, the topic vectors, and the distributions? Thanks

lfmatosm commented 1 year ago

Hi @Sara-Online, sorry for the (very) late response. Thanks for using this package!

You can get the topic words with this method. Note that you can select how many words per topic you're interested in:

topics = etm_instance.get_topics(top_n_words=20)

You can get the topic-word matrix with this method. Note that it will return all words for each topic:

t_w_mtx = etm_instance.get_topic_word_matrix()

You can get the topic-word distribution matrix and the document-topic distribution matrix with the following methods. Both return a normalized distribution matrix:

t_w_dist_mtx = etm_instance.get_topic_word_dist()
d_t_dist_mtx = etm_instance.get_document_topic_dist()
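
As a rough illustration of what can be done with these two matrices, the sketch below picks the highest-probability words for each topic and the dominant topic for each document. It makes several assumptions that this thread does not confirm: that each row of the topic-word matrix is a topic and each column is a vocabulary index, that each row of the document-topic matrix is a document, and that the returned objects can be turned into nested lists of floats (for a tensor or array, something like `.tolist()` may be needed). The helper names `top_words_per_topic` and `dominant_topic_per_document` are hypothetical, and the numbers are toy stand-ins rather than real model output.

```python
from typing import List, Sequence


def top_words_per_topic(
    topic_word_dist: Sequence[Sequence[float]],
    vocabulary: Sequence[str],
    top_n: int = 3,
) -> List[List[str]]:
    """Return the top_n highest-probability words for each topic row."""
    result = []
    for row in topic_word_dist:
        # Rank vocabulary indices by descending probability within this topic.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        result.append([vocabulary[i] for i in ranked[:top_n]])
    return result


def dominant_topic_per_document(
    doc_topic_dist: Sequence[Sequence[float]],
) -> List[int]:
    """Return the index of the most probable topic for each document row."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in doc_topic_dist]


# Toy stand-ins (assumption): in practice these would come from
# etm_instance.get_topic_word_dist() and etm_instance.get_document_topic_dist(),
# converted to nested lists, plus the vocabulary used to train the model.
vocabulary = ["cat", "dog", "stock", "market"]
topic_word_dist = [
    [0.5, 0.3, 0.1, 0.1],    # topic 0
    [0.05, 0.05, 0.5, 0.4],  # topic 1
]
doc_topic_dist = [
    [0.9, 0.1],  # document 0
    [0.2, 0.8],  # document 1
    [0.6, 0.4],  # document 2
]

top_words = top_words_per_topic(topic_word_dist, vocabulary, top_n=2)
dominant = dominant_topic_per_document(doc_topic_dist)
print(top_words)
print(dominant)
```

Since both matrices are described as normalized, checking that each row sums to roughly 1 is a quick way to confirm which axis holds topics before relying on the indices.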

I will update the documentation soon to describe these methods.

Cheers, Luiz

lfmatosm commented 1 year ago

Documentation updated with #15