Open ShyamGanesh13 opened 10 months ago
If you want to add words, then it might be worthwhile to do so by adding those words to topic_model.topic_representations_. That variable contains the core representations. If, however, you know these words appear in the data, you can also seed them so that they are likely to appear in the resulting topic representations.
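As a rough illustration of the first suggestion, here is a minimal sketch. It assumes topic_model.topic_representations_ is a dict mapping each topic id to a list of (word, score) tuples. The helper add_words and its placeholder score of 0.0 are hypothetical and not part of BERTopic.

```python
def add_words(representations, topic, words, score=0.0):
    """Append custom words to one topic's representation, in place.

    `representations` is assumed to map topic id -> list of (word, score).
    `score` is a placeholder (an assumption); the words did not come from
    the model, so no fitted score is attached here.
    """
    existing = {word for word, _ in representations[topic]}
    for word in words:
        if word not in existing:  # skip words the topic already contains
            representations[topic].append((word, score))
            existing.add(word)
    return representations[topic]


# Hypothetical usage with a fitted model:
# add_words(topic_model.topic_representations_, 0, ["kitten", "cute"])
```

Only the stored representation changes here; the topic assignments and the fitted c-TF-IDF matrix stay as they were.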
If we add those words to topic_model.topic_representations_, is there any way to calculate the scores associated with them? For example: 0: [['cat', 0.40245479345321655], ['cats', 0.3927491307258606], ['paws', 0.3698537349700928], ['kitten', ?], ['cute', ?]]
You can extract the values from topic_model.c_tf_idf_ and use topic_model.vectorizer_model.get_feature_names_out() to find the right indices of the values.
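A minimal sketch of that lookup follows. It assumes the matrix has one row per topic and one column per vocabulary word, in the same order as the feature names. The helper lookup_scores is hypothetical, and the row-order assumption in the usage comments (rows follow the sorted topic ids, so an outlier topic -1 would occupy row 0) should be checked against your BERTopic version.

```python
def lookup_scores(c_tf_idf, feature_names, row, words):
    """Return [(word, score)] for `words`, read from one row of the matrix.

    Words that are not in the fitted vocabulary get a score of None,
    since the matrix has no column for them.
    """
    column_of = {word: i for i, word in enumerate(feature_names)}
    scores = []
    for word in words:
        col = column_of.get(word)
        value = float(c_tf_idf[row, col]) if col is not None else None
        scores.append((word, value))
    return scores


# Hypothetical usage with a fitted model:
# names = topic_model.vectorizer_model.get_feature_names_out()
# topic_ids = sorted(topic_model.topic_representations_)  # assumed row order
# row = topic_ids.index(0)
# lookup_scores(topic_model.c_tf_idf_, names, row, ["kitten", "cute"])
```

A word such as kitten only has a value if it occurs in the documents the vectorizer was fitted on; otherwise the lookup returns None for it.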
Hi @MaartenGr, is there an option to add my own set of words to a topic generated by a BERTopic model?
For example, assume I have two topics with the topic labels 1_cat_cats_paws and 2_dog_dogs_puppy generated from my dataset. Can I add some extra words to these topics, so that they become 1_cat_cats_paws_kitten_cute and 2_dog_dogs_puppy_bark?
Note: words like kitten, cute and bark are my own words (not generated by the model) that I need to add to the topics already created by the BERTopic model.