DmitryUlyanov / Multicore-TSNE

Parallel t-SNE implementation with Python and Torch wrappers.

Results differ from scikit-learn implementation #8

Open areshytko opened 7 years ago

areshytko commented 7 years ago

t-SNE is inherently randomized, but not that much: this implementation consistently produces different (and much worse) results than scikit-learn's Barnes-Hut implementation.

Example on IRIS dataset:

Scikit-learn with default parameters and learning rate 100:

[scikit-learn embedding image]

Multicore t-SNE with default parameters and learning rate 100:

[Multicore t-SNE embedding image]

The greater distance of the setosa cluster is also supported by the general statistical properties of the dataset (and by other embedding algorithms), so the scikit-learn results are more consistent with the original manifold structure.
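For reference, a minimal sketch of the scikit-learn side of the comparison described above (default parameters except the learning rate; the `random_state` is an assumption added here for reproducibility and was not specified in the thread):

```python
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

X = load_iris().data  # 150 samples, 4 features

# Barnes-Hut t-SNE with default parameters and learning rate 100,
# as described above; random_state is an added assumption
tsne = TSNE(learning_rate=100, random_state=0)
Y = tsne.fit_transform(X)
print(Y.shape)  # → (150, 2), one 2-D point per iris sample
```

Plotting `Y` colored by `load_iris().target` should reproduce the scikit-learn picture referenced above.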

DmitryUlyanov commented 7 years ago

Did you try py_bh_tsne or any other non-scikit-learn package? Do they also produce worse results? There can be implementation differences, different default parameters, and so on. This repo uses py_bh_tsne as its base; I fixed some errors there, but it may still be imperfect. I would give it another try and check the implementation, but I hope the scikit-learn folks improve their t-SNE's efficiency sooner, making this repo useless (which is how it should be).

areshytko commented 7 years ago

Yes, unfortunately sk-learn's t-SNE is currently unusable except on toy datasets like this one. And yes, it's strange: the output shows that the algorithm quickly converged to a low error and then stopped making any further progress.


Learning embedding...
Iteration 50: error is 43.405481 (50 iterations in 0.00 seconds)
Iteration 100: error is 44.709520 (50 iterations in 0.00 seconds)
Iteration 150: error is 43.567784 (50 iterations in 0.00 seconds)
Iteration 200: error is 42.564679 (50 iterations in 0.00 seconds)
Iteration 250: error is 1.118502 (50 iterations in 0.00 seconds)
Iteration 300: error is 0.238091 (50 iterations in 0.00 seconds)
Iteration 350: error is 0.117268 (50 iterations in 0.00 seconds)
Iteration 400: error is 0.120770 (50 iterations in 0.00 seconds)
Iteration 450: error is 0.121062 (50 iterations in 0.00 seconds)
Iteration 500: error is 0.121366 (50 iterations in 0.00 seconds)
Iteration 550: error is 0.121098 (50 iterations in 0.00 seconds)
Iteration 600: error is 0.121540 (50 iterations in 0.00 seconds)
Iteration 650: error is 0.121057 (50 iterations in 0.00 seconds)
Iteration 700: error is 0.120856 (50 iterations in 0.00 seconds)
Iteration 750: error is 0.121666 (50 iterations in 0.00 seconds)
Iteration 800: error is 0.121161 (50 iterations in 0.00 seconds)
Iteration 850: error is 0.121708 (50 iterations in 0.00 seconds)
Iteration 900: error is 0.121865 (50 iterations in 0.00 seconds)
Iteration 950: error is 0.122631 (50 iterations in 0.00 seconds)
Iteration 999: error is 0.121577 (50 iterations in 0.00 seconds)
Fitting performed in 0.00 seconds.

By comparison, the MNIST test example progressed slowly but steadily until the last iteration. And the IRIS dataset is a simple one: linearly separable.

No, I haven't tried other implementations yet.

DmitryUlyanov commented 7 years ago

Well, it actually reports an incorrect loss before iteration 200 (check the code), so I would not trust this log.

Did you try this one: https://github.com/cemoody/topicsne? It would be interesting to compare it with this repo and sk-learn in both quality and speed.


shaidams64 commented 7 years ago

I also got a very different result from the sklearn implementation on the MNIST dataset:

Multicore t-SNE: [screenshot, 2017-07-13 10:39 am]
sklearn t-SNE: [screenshot, 2017-07-13 10:39 am]

DmitryUlyanov commented 7 years ago

Hi, the picture in the README file is the t-SNE visualization of the MNIST dataset, made with the code from this repository. Here is the code: https://github.com/DmitryUlyanov/Multicore-TSNE/blob/master/python/tests/test.py

shaidams64 commented 7 years ago

Hey, I loaded the dataset from sklearn and ran MulticoreTSNE on it; would that make a difference?

from MulticoreTSNE import MulticoreTSNE as MultiTSNE
from sklearn.datasets import load_digits
import matplotlib.pyplot as plt

digits2 = load_digits()
m_tsne = MultiTSNE(n_jobs=4, init='pca', random_state=0)
m_y = m_tsne.fit_transform(digits2.data)
plt.scatter(m_y[:, 0], m_y[:, 1], c=digits2.target)
plt.show()

DmitryUlyanov commented 7 years ago

I do not know for sure, but the format the digits are stored in can differ, e.g. [0, 1] versus 0...255. And t-SNE does gradient descent, which may fail if the scaling and learning rate are mismatched.

Try the example test.py from above; do you get a reasonable image?

shaidams64 commented 7 years ago

Yes, it works with your example. It appears the scalings of the two datasets differ: the dataset from sklearn is 0...16, but the one in your example is [-1, 1]. So does this version only work with normalized datasets?
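A minimal sketch of the rescaling in question, assuming a simple global min-max scale to [-1, 1] (the exact preprocessing used in the repo's test.py may differ):

```python
import numpy as np
from sklearn.datasets import load_digits

# load_digits pixel values are integers in 0..16
X = load_digits().data.astype(np.float64)

# Global min-max scale to [-1, 1], the range the test.py data appears to use
X_scaled = 2 * (X - X.min()) / (X.max() - X.min()) - 1
```

`X_scaled` would then be passed to `MulticoreTSNE.fit_transform` in place of the raw `digits2.data` above.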

bartimus9 commented 7 years ago

Thank you for putting this together; it is the only multicore t-SNE application I can get to complete successfully. However, my results are identical to shaidams64's. I have an arcsinh-transformed data set, and an R implementation of this method (single core) gives good results; the sklearn implementation (Python) on the same data set returns a very similar result. This multicore implementation runs quickly but produces an indiscernible cloud of points. I have carefully aligned all of the arguments I can, and the result is the same, even when I set MulticoreTSNE to use only one core. Any recommendations on how to fix this?

EDIT: This discussion thread ends with a multicore TSNE implementation that does reproduce my results with Sklearn and Rtsne. https://github.com/lvdmaaten/bhtsne/issues/18
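For reference, the arcsinh transform mentioned above is commonly applied with a cofactor; both the cofactor value and the synthetic data below are assumptions for illustration, since the actual data set is not shown in the thread:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the raw intensities; heavy-tailed positive values
raw = rng.exponential(scale=1000.0, size=(100, 10))

cofactor = 5.0  # assumed cofactor; a common choice for cytometry-style data
X = np.arcsinh(raw / cofactor)
```

The transform compresses large values roughly logarithmically while staying linear near zero, which changes the scale of the input and can therefore interact with t-SNE's learning rate as discussed above.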

YubinXie commented 6 years ago

Is this problem solved with this multi-core tsne?

Ryanglambert commented 6 years ago

I used this recently and didn't see a noticeable speed up :shrug:. This was on an AWS instance with 32 cores. I was hopeful.


orihomie commented 5 years ago

Hi, I'm facing the same problem now: the results of sklearn's t-SNE and yours differ with the same params.

> Yes it works with your example. It appears the scalings are different for the datasets. The dataset from sklearn is 0...16 but the one in your example is [-1,1]. So is this version working only with normalized datasets?

So, if I'm getting it right, normalizing the data should help (to make the results roughly the same)?