pavlin-policar / openTSNE

Extensible, parallel implementations of t-SNE
https://opentsne.rtfd.io
BSD 3-Clause "New" or "Revised" License

Test failure on i386 #247

Open utkarsh2102 opened 10 months ago

utkarsh2102 commented 10 months ago

Hey, thanks for working on this, @pavlin-policar. However, when building the package on i386, one of the tests fails:

=================================== FAILURES ===================================
_________ TestTSNECorrectnessUsingPrecomputedDistanceMatrix.test_iris __________

self = <tests.test_correctness.TestTSNECorrectnessUsingPrecomputedDistanceMatrix
testMethod=test_iris>

    def test_iris(self):
        x = datasets.load_iris().data
        x += np.random.normal(0, 1e-3, x.shape)  # iris contains duplicate rows

        distances = squareform(pdist(x))
        params = dict(initialization="random", random_state=0)
        embedding1 = TSNE(metric="precomputed", **params).fit(distances)
        embedding2 = TSNE(metric="euclidean", **params).fit(x)

>       np.testing.assert_almost_equal(embedding1, embedding2)
E       AssertionError:
E       Arrays are not almost equal to 7 decimals
E
E       Mismatched elements: 300 / 300 (100%)
E       Max absolute difference: 3.4794089
E       Max relative difference: 19.31117662
E        x: TSNEEmbedding([[ -8.8343696,  14.5389743],
E                      [ -6.5431559,  15.9090961],
E                      [ -7.3407737,  16.6576344],...
E        y: TSNEEmbedding([[ -8.8276681,  14.0647152],
E                      [ -6.1690237,  14.5453917],
E                      [ -6.8637968,  15.4036782],...

tests/test_correctness.py:299: AssertionError
=========================== short test summary info ============================
FAILED tests/test_correctness.py::TestTSNECorrectnessUsingPrecomputedDistanceMatrix::test_iris

utkarsh2102 commented 10 months ago

Oh it also fails on Ubuntu ppc64el. Here are the logs:

https://launchpadlibrarian.net/682040467/buildlog_ubuntu-mantic-ppc64el.opentsne_0.6.2-1_BUILDING.txt.gz

pavlin-policar commented 10 months ago

That is interesting. Unfortunately, I don't know how I could debug this, as I do not have access to these platforms. Are there actually any visual differences when looking at these embeddings? I know this is a test run with no visuals, but have you tried looking at the difference between passing in a precomputed distance matrix and letting openTSNE compute the distances itself?
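Without plots, one rough way to judge whether two embeddings are "the same picture" is to check how many nearest neighbors each point keeps between them. The following is only a sketch under that assumption: `knn_sets` and `mean_knn_overlap` are hypothetical helper names, and the coordinates are made up rather than taken from the failing test.

```python
import math

def knn_sets(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors."""
    result = []
    for i, p in enumerate(points):
        # Sort all other points by Euclidean distance to p.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        result.append(set(others[:k]))
    return result

def mean_knn_overlap(emb_a, emb_b, k=10):
    """Average fraction of k nearest neighbors shared between two embeddings."""
    sets_a = knn_sets(emb_a, k)
    sets_b = knn_sets(emb_b, k)
    shared = [len(a & b) / k for a, b in zip(sets_a, sets_b)]
    return sum(shared) / len(shared)

# Made-up 2D coordinates, NOT the embeddings from the failing test.
emb1 = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0), (10.0, 10.0), (11.0, 10.2), (12.0, 10.0)]
# A shifted copy standing in for the second embedding.
emb2 = [(x + 0.3, y - 0.5) for x, y in emb1]

print(mean_knn_overlap(emb1, emb2, k=2))
```

A value close to 1 would suggest the two embeddings differ only in fine detail; a low value would point to a real structural difference. For array-like embeddings, converting to nested lists first (for example with `.tolist()`) is probably the safest way to feed them in.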

picca commented 10 months ago

Hello, I have uploaded version 1.0.0 to Debian, and yes, the problem is still present.

https://buildd.debian.org/status/logs.php?pkg=opentsne&ver=1.0.0-1&suite=sid

This is really strange: the 32-bit ARM architectures pass the tests, so this really seems to be i386-related.

Is there any x86-specific code?

I removed the native flag during compilation, since it violates Debian policy.

On i386 we do not support SSE instructions.

https://wiki.debian.org/ArchitectureSpecificsMemo#i386-1

Cheers

pavlin-policar commented 9 months ago

This could potentially have something to do with the Annoy library. I believe it has some AVX instructions and such; however, I really don't know this code at all and wouldn't even know where to begin trying to fix it.

However, it seems that maybe they have taken some care to add i386: https://github.com/spotify/annoy/pull/207/files

So maybe it's not that? I generally tried to copy their setup pipeline into https://github.com/pavlin-policar/openTSNE/blob/master/setup.py as closely as possible, but perhaps I've messed something up?

As I've said, I have no way to reproduce this and I am not familiar with the platform-specific compiler directives, so any help on this would be greatly appreciated.

dkobak commented 9 months ago

Just to clarify: is this the ONLY test that fails? I don't think this test relies on Annoy: it uses the Iris data, which is so small that openTSNE should use exact NN.

pavlin-policar commented 9 months ago

Sorry, I totally missed the context. Only one test is failing! I have no idea what could be causing this.

Could it be due to accumulating rounding errors or something? The actual embedding coordinates do seem to be ranked similarly. Or could it be that scipy computes distances a bit differently to scikit-learn on i386?
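To illustrate how rounding alone can make two distance computations disagree, here is a minimal sketch that evaluates the same Euclidean distance with two algebraically equivalent formulas. It uses only the standard library and random iris-like points; `dist_direct` and `dist_expanded` are hypothetical names. scikit-learn's `euclidean_distances` documents the expanded form, but whether that is the code path taken in the failing test is an assumption, and this says nothing specific about i386.

```python
import math
import random

def dist_direct(a, b):
    """Euclidean distance as sqrt(sum((a_i - b_i)^2))."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dist_expanded(a, b):
    """Same distance via the expansion |a|^2 - 2 a.b + |b|^2."""
    aa = sum(x * x for x in a)
    bb = sum(y * y for y in b)
    ab = sum(x * y for x, y in zip(a, b))
    # Clamp at zero: cancellation can make the argument slightly negative.
    return math.sqrt(max(aa - 2.0 * ab + bb, 0.0))

rng = random.Random(0)
# Hypothetical iris-like points: four features, values of a few units.
points = [[rng.uniform(0.0, 8.0) for _ in range(4)] for _ in range(150)]

max_abs_diff = 0.0
for i in range(len(points)):
    for j in range(i + 1, len(points)):
        d1 = dist_direct(points[i], points[j])
        d2 = dist_expanded(points[i], points[j])
        max_abs_diff = max(max_abs_diff, abs(d1 - d2))

print(f"max |direct - expanded| = {max_abs_diff:.3e}")
```

Differences of this kind are tiny on their own, but if they change which points count as nearest neighbors, the two optimizations could start from slightly different affinities and drift apart.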

dkobak commented 8 months ago

> Or could it be that scipy computes distances a bit differently to scikit-learn on i386?

No idea, but it does seem like the problem may be upstream. It would be interesting to compare sklearn NearestNeighbors results on X with those on pdist(X) with metric=precomputed, to see whether there is indeed any difference. Is there any way to run this on i386 in a virtual machine? Otherwise, @utkarsh2102 @picca, would you like to help debug this?

picca commented 8 months ago

What about adding this test to the unit test suite? Then I would be able to run it on all our architectures.

We can use multiple release cycles to check this.

I do not have a lot of time for this myself, but packaging an openTSNE release that contains this test is something I can do.

For i386, maybe you can use QEMU?

dkobak commented 8 months ago

@picca I was just wondering if you could run the following snippet in your Python environment on i386:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import NearestNeighbors
from scipy.spatial.distance import pdist, squareform

# Load iris and add Gaussian noise with a fixed seed
X = load_iris()['data']
np.random.seed(42)
X += np.random.randn(*X.shape) / 10

# Neighbors computed by scikit-learn directly from the data
nn = NearestNeighbors(metric="euclidean").fit(X)
distances1, indices1 = nn.kneighbors(n_neighbors=90)

# Neighbors computed from a distance matrix precomputed with scipy
nn = NearestNeighbors(metric="precomputed").fit(squareform(pdist(X)))
distances2, indices2 = nn.kneighbors(n_neighbors=90)

# Both routes should give the same distances and the same neighbor order
assert np.allclose(distances1, distances2)
assert np.all(indices1 == indices2)

In my environment this passes.
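If one of the assertions in the snippet above fails on i386, a bare `AssertionError` says little about where the two results part ways. The sketch below is one possible way to report the first rank at which each row of neighbor indices differs. `first_divergence` is a hypothetical helper, and the lists are toy data, not real output.

```python
def first_divergence(indices_a, indices_b):
    """Yield (row, rank, a, b) for the first rank at which each row differs."""
    for row, (row_a, row_b) in enumerate(zip(indices_a, indices_b)):
        for rank, (a, b) in enumerate(zip(row_a, row_b)):
            if a != b:
                yield row, rank, a, b
                break

# Toy neighbor lists, NOT real output from the snippet above.
toy1 = [[1, 2, 3], [0, 2, 3], [3, 1, 0]]
toy2 = [[1, 2, 3], [0, 3, 2], [3, 1, 0]]

for row, rank, a, b in first_divergence(toy1, toy2):
    print(f"row {row}: rank {rank} differs ({a} vs {b})")
```

With the real arrays, something like `first_divergence(indices1.tolist(), indices2.tolist())` should work. Rows that diverge only at ranks where two neighbors are almost equally distant would point toward rounding-induced tie flips rather than a genuinely different distance computation.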