Closed aholovenko closed 1 year ago
Thank you for sharing this. I will have to look into this a bit more since the shift may or may not happen depending on whether HDBSCAN is used or another clustering algorithm that does not have -1 in its possible classes.
Thanks @MaartenGr
I was generating the heatmap using the `self.model.visualize_heatmap()` method and noticed that the visualization doesn't match the distance values. I think the issue is that you are including the `-1` topic when getting `self.topic_embeddings_` in the code, but removing it in the `visualize_heatmap()` function.
Could you check, please? https://github.com/MaartenGr/BERTopic/blob/09c1732997f838050c263ad00ad3c9474e816863/bertopic/plotting/_heatmap.py#L93 I guess this provides correct results.
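To make the suspected shift concrete, here is a minimal sketch using only NumPy and toy data. It is not BERTopic's actual code: the names `topic_ids`, `topic_embeddings`, and `cosine_similarity_matrix` are hypothetical, and it assumes that row 0 of the embedding matrix belongs to the outlier topic `-1` whenever that topic exists.

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between the rows of `embeddings`."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return normed @ normed.T


# Hypothetical toy data: row 0 stands for the outlier topic -1,
# rows 1..3 stand for topics 0..2.
topic_ids = [-1, 0, 1, 2]
topic_embeddings = np.array([
    [1.0, 0.0],   # topic -1 (outliers)
    [0.0, 1.0],   # topic 0
    [1.0, 1.0],   # topic 1
    [-1.0, 1.0],  # topic 2
])

# The heatmap only shows real topics, so -1 is dropped from the labels.
plotted_topics = np.array([t for t in topic_ids if t != -1])

# Misaligned: using the topic id directly as a row index picks up the
# outlier row and drops the last topic.
shifted = topic_embeddings[plotted_topics]

# Aligned: offset the index by one only when the -1 topic is present,
# which also covers clustering algorithms that never produce -1.
offset = 1 if -1 in topic_ids else 0
aligned = topic_embeddings[plotted_topics + offset]

shifted_sim = cosine_similarity_matrix(shifted)
aligned_sim = cosine_similarity_matrix(aligned)
```

With this toy data the two similarity matrices differ, so the labels on the plot would no longer describe the values shown. Making the offset conditional on whether `-1` is present keeps the indexing correct both with HDBSCAN and with clustering algorithms that have no outlier class.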