elbamos / largeVis

An implementation of the largeVis algorithm for visualizing large, high-dimensional datasets, for R

LargeVis vs tSNE clustering #56

Closed akhst7 closed 3 years ago

akhst7 commented 6 years ago

I have a data frame of 20,000 options with 14 distinct parameters. A standard Rtsne run generates relatively well-distributed and balanced clusters, as follows:

[Image: rplot]

However, I have been struggling to generate a similar plot with largeVis. I've been playing with K and n_tree but have not yet gotten a plot that comes close to the tSNE one.

e.g. v <- largeVis(test, K = 200, n_tree = 200, distance_method = "Euclidean", threads = 16)

[Image: rplot01]

I'd appreciate it if you could give me some pointers to get started (Stack Overflow was not helpful).

elbamos commented 6 years ago

That looks to me like a plot of an hdbscan clustering, possibly over a largevis dimensionality reduction.

If so, I’m not sure what the objection is to the largevis plot. Is it that there’s a bunch of biggish clusters rather than a larger number of smaller ones? If so you could try reducing K. You might also consider scaling the data.

If you were looking for a particular result, you might add dimensionality to help you get there, effectively semi-supervising.

But in general, it’s important to remember that neither tsne nor largevis guarantees an outcome that’s either aesthetically pleasing or produces whatever separation was desired a priori.
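
For readers following along, the two concrete suggestions above (scale the data, then try a smaller K) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `test` data frame here is synthetic stand-in data, not the poster's, and `K = 50` is an arbitrary smaller value rather than a recommendation. The scaling step uses base R's `scale()`; the largeVis call is left commented out and simply mirrors the call shape from the question.

```r
# Hypothetical stand-in for the data frame in the question: two numeric
# columns on very different scales (the real data has 14 parameters).
set.seed(42)
test <- data.frame(
  small = rnorm(200, mean = 0, sd = 1),
  large = rnorm(200, mean = 1000, sd = 250)
)

# Without scaling, Euclidean distance is dominated by the wide column.
apply(test, 2, sd)

# Centre every column on 0 and rescale it to unit standard deviation.
scaled <- scale(as.matrix(test))

# Each column now has mean ~0 and standard deviation 1.
round(colMeans(scaled), 10)
apply(scaled, 2, sd)

# The scaled matrix can then be passed to largeVis with a smaller K,
# e.g. (not run here; check ?largeVis for the expected input orientation):
# v <- largeVis(scaled, K = 50, threads = 16)
```

Scaling matters here because a Euclidean nearest-neighbour search treats all columns as commensurable, so one parameter measured in large units can decide the neighbourhoods on its own.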

(Sent by email in reply to akhst7's post of May 9, 2018.)