Closed rbroc closed 6 months ago
I was thinking about this actually. I don't remember if I checked, but I will give it one more look in Top2Vec. My guess, though, is that it doesn't change much, since UMAP and t-SNE are both based on nearest neighbours.
I just checked in Top2Vec, and to me it seems that they also find the relevant words in the original high-dimensional space. (though I would love a sanity check if you still have concerns :smile: )
they do! -- i've opened an issue there bc i am curious why and whether they tried w/ reduced vectors, but totally fine leaving this as is here (thus closing this)
more of a question than an issue, but I noticed that, in `ClusteringTopicModel`, while clustering is performed on reduced vectors, topic centroids and feature importances are computed on the full vectors (pre-dimensionality reduction). Is that the intended behavior?
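To make the question concrete, here is a minimal NumPy sketch of the pattern being described: cluster labels come from the reduced vectors, but centroids and word rankings are computed in the original space. This is not the library's actual code. The helper names (`centroids_in_original_space`, `top_words`), the toy data, the "keep the first two coordinates" reduction and the sign-threshold clustering are all stand-ins invented for illustration, and cosine similarity to the centroid is only one possible way to score words.

```python
import numpy as np


def centroids_in_original_space(embeddings, labels):
    """Average the full-dimensional vectors of each cluster (hypothetical helper)."""
    return {int(k): embeddings[labels == k].mean(axis=0) for k in np.unique(labels)}


def top_words(centroid, word_vectors, vocab, n=3):
    """Rank vocabulary items by cosine similarity to a centroid (hypothetical helper)."""
    norms = np.linalg.norm(word_vectors, axis=1) * np.linalg.norm(centroid)
    sims = word_vectors @ centroid / norms
    return [vocab[i] for i in np.argsort(-sims)[:n]]


rng = np.random.default_rng(0)

# Toy "document embeddings": two well-separated groups in 16 dimensions.
docs = np.vstack([rng.normal(5, 1, (20, 16)), rng.normal(-5, 1, (20, 16))])

# Stand-in for UMAP/t-SNE (assumption): keep only the first two coordinates.
reduced = docs[:, :2]

# Stand-in for the clustering step (assumption): split on the sign of the
# first reduced coordinate. Labels are derived from the REDUCED vectors only.
labels = (reduced[:, 0] < 0).astype(int)

# Centroids are computed from the FULL vectors, using those labels.
centroids = centroids_in_original_space(docs, labels)

# Toy vocabulary embedded in the same 16-dimensional space as the documents.
vocab = ["positive", "negative"]
word_vectors = np.vstack([np.ones(16), -np.ones(16)])
best_for_cluster_0 = top_words(centroids[0], word_vectors, vocab, n=1)
```

One reason this split can make sense is that word vectors live in the original embedding space, so a centroid computed there can be compared to them directly, whereas a centroid in the reduced space would first need the words projected through the same reduction.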