Open rap8 opened 2 months ago
Could you also share your full code along with the version of BERTopic you are using? How many documents are you passing to BERTopic?
If there are indeed only -1 topics, the way to avoid that is to increase the number of topics being generated. Most likely, and it is difficult to say without seeing your full code, you will need to decrease the value of `min_topic_size` or its HDBSCAN equivalent, `min_cluster_size`.
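To make that suggestion concrete, here is a minimal sketch, not a definitive fix. The value 5 is only illustrative (BERTopic's default `min_topic_size` is 10), and the helper names `outlier_share`, `fit_with_smaller_topics`, and `fit_with_custom_hdbscan` are hypothetical, not part of BERTopic. Smaller values let smaller clusters form, which tends to produce more topics and fewer -1 assignments.

```python
def outlier_share(topics):
    """Return the fraction of documents assigned to the outlier topic -1."""
    if len(topics) == 0:
        return 0.0
    return sum(1 for t in topics if t == -1) / len(topics)


def fit_with_smaller_topics(docs, min_topic_size=5):
    """Fit BERTopic with a smaller minimum topic size (illustrative value)."""
    # Imported lazily so outlier_share can be used without BERTopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic(min_topic_size=min_topic_size)
    topics, probs = topic_model.fit_transform(docs)
    return topic_model, topics


def fit_with_custom_hdbscan(docs, min_cluster_size=5):
    """Same idea, but setting min_cluster_size on the HDBSCAN model directly."""
    from bertopic import BERTopic
    from hdbscan import HDBSCAN

    hdbscan_model = HDBSCAN(
        min_cluster_size=min_cluster_size,  # illustrative value, tune for your data
        metric="euclidean",
        cluster_selection_method="eom",
        prediction_data=True,
    )
    topic_model = BERTopic(hdbscan_model=hdbscan_model)
    topics, probs = topic_model.fit_transform(docs)
    return topic_model, topics
```

After fitting, `outlier_share(topics)` returning 1.0 means every document landed in topic -1, which is the situation described in this issue.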
Hi, I have a question. The get_topic_info function prints only 3 representative documents for each topic. So how do I know which topic different documents belong to? I can't seem to find it in your introduction document.
> So how do I know which topic different documents belong to?
The output of `.fit_transform` gives you the `topic` variable, which contains the topic assigned to each document. You can also find this assignment in the `topic_model.topics_` attribute.
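As a rough illustration of that answer, the sketch below pairs each document with its assigned topic. The helper names `docs_by_topic` and `fit_and_group` are hypothetical and not part of BERTopic; the only library calls assumed are `BERTopic()` and `.fit_transform(docs)`, which returns one topic id per document in the same order as `docs`.

```python
from collections import defaultdict


def docs_by_topic(docs, topics):
    """Group documents by their assigned topic id (one id per document)."""
    if len(docs) != len(topics):
        raise ValueError("docs and topics must have the same length")
    grouped = defaultdict(list)
    for doc, topic in zip(docs, topics):
        grouped[topic].append(doc)
    return dict(grouped)


def fit_and_group(docs):
    """Fit BERTopic and return a mapping of topic id to its documents."""
    # Imported lazily so docs_by_topic can be used without BERTopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic()
    topics, probs = topic_model.fit_transform(docs)
    # topic_model.topics_ holds the same per-document assignments after fitting.
    return docs_by_topic(docs, topics)
```

Recent BERTopic versions also provide `topic_model.get_document_info(docs)`, which returns a table with one row per document and its topic.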
Thanks, I see.
This error occurred when I was processing some Word documents. It is intermittent: sometimes it occurs and sometimes it does not. Looking at the error message, it is raised in `_bertopic.py`, line 4024, when the topics are all -1, because the `unique_topics` parameter is empty. Is there any way to avoid this error?

![image](https://github.com/MaartenGr/BERTopic/assets/110172520/97050ae1-263a-4bff-a311-74f99589dfce)