Open knbknb opened 6 years ago
The seed does indeed have a big impact on a t-SNE plot: the embedding is initialised randomly and then optimised by gradient descent on a non-convex objective, so different seeds can end up in different local optima.
I would like the authors to provide the seeds they used to generate their subplots. If you click on, say, the thumbnail in paragraph 1 (Perplexity 2, Steps 5000), it shows a step-by-step visualization that does not recreate the thumbnail you clicked on.
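To illustrate the local-optima point, here is a minimal sketch in Python. It is not t-SNE and not the code behind the blog post: it swaps in a toy one-dimensional objective with two minima, and the names `loss`, `grad`, `descend` and `basin` are made up for this example. The only thing that changes between runs is the seed used for the random starting point.

```python
import random

def loss(x):
    # Toy non-convex objective (an assumption for illustration, not the t-SNE cost).
    # It has one minimum near x = -1 and another near x = +1.
    return (x * x - 1.0) ** 2 + 0.3 * x

def grad(x):
    # Derivative of loss(x).
    return 4.0 * x * (x * x - 1.0) + 0.3

def descend(seed, steps=2000, lr=0.01):
    # Seeded random start, then plain gradient descent with a fixed step size.
    rng = random.Random(seed)
    x = rng.uniform(-2.0, 2.0)
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def basin(x):
    # Which of the two minima the run ended in.
    return "left" if x < 0 else "right"

# Same objective, same step count, same learning rate; only the seed differs.
results = {seed: descend(seed) for seed in range(10)}
for seed, x in results.items():
    print(f"seed={seed}  x={x:.4f}  basin={basin(x)}  loss={loss(x):.4f}")
```

Rerunning with a given seed gives the same end point every time, which is why publishing the seeds would make the thumbnails reproducible.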
I am by no means an expert on t-SNE, and I think your blog post https://distill.pub/2016/misread-tsne/ is quite helpful for beginners like myself.
Now that I have installed the R package Rtsne (https://github.com/jkrijthe/Rtsne), I ran one of the code examples from its documentation. There I noticed that the random seed used for initialisation is also very important. However, you don't mention the influence of the random seed in your blog post.
For the small (n = 150) iris dataset, the clusters just seem to flip-flop to different edges of the 2D plotting area, but for more complex data the plots can look completely different, even when the hyperparameters are held constant and only the initial values change.
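Part of the flip-flopping is probably harmless: the t-SNE cost depends only on the pairwise distances between the embedded points, so a rotated or mirrored layout scores exactly the same. A small Python sketch of that invariance, using made-up 2D points rather than a real t-SNE output (the helper names are hypothetical):

```python
import math

def pairwise_distances(points):
    # Distances between every unordered pair of points, in a fixed order.
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]

def rotate(points, angle):
    # Rotate all points about the origin by `angle` radians.
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def mirror(points):
    # Reflect all points across the vertical axis.
    return [(-x, y) for x, y in points]

# Made-up 2D "embedding" (illustrative values only).
embedding = [(0.0, 0.0), (1.0, 0.5), (4.0, 3.0), (-2.0, 5.0)]
flipped = mirror(rotate(embedding, math.radians(130)))

original_d = pairwise_distances(embedding)
flipped_d = pairwise_distances(flipped)

# The coordinates differ, but every pairwise distance is preserved.
same_geometry = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(original_d, flipped_d)
)
print(same_geometry)
```

This only covers rigid flips and rotations. Plots that differ in which clusters sit next to each other, as with the more complex data, are genuinely different solutions.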
(Let me know if this is not written clearly enough.)