eamid / trimap

TriMap: Large-scale Dimensionality Reduction Using Triplets
Apache License 2.0

Initial coordinates are exported as `NaN` or an error is thrown #23

Closed: jlmelville closed this issue 2 years ago

jlmelville commented 2 years ago

First, thanks for your work on this package and technique, and for making it available for others to study and experiment with.

The following example (using `return_seq=True`) raises a `ValueError` for me:

    import trimap
    from sklearn.datasets import load_digits

    digits = load_digits()

    embedding = trimap.TRIMAP(return_seq=True).fit_transform(digits.data, init="pca")

Omitting the `init` argument, so that it takes its default value of `None`, allows the computation to finish, but the initial coordinates are exported as `nan`:

    embedding = trimap.TRIMAP(return_seq=True).fit_transform(digits.data)

    embedding[:, :, 0]
    array([[nan, nan],
           [nan, nan],
           [nan, nan],
           ...,
           [nan, nan],
           [nan, nan],
           [nan, nan]])

I think the following line:

https://github.com/eamid/trimap/blob/a7250f3225f54f0640f728f290fd09139e386eaf/trimap/trimap_.py#L591

should be:

    Y_all[:, :, 0] = Y

as `Y_init` may hold a string like "random" (which causes the `ValueError`) or `None` (hence the `nan`s), whereas `Y` holds the actual initial coordinates.
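
For reference, here is a minimal sketch of both failure modes using plain NumPy. The arrays are hypothetical stand-ins, and the shapes and the names `n_points`, `n_dims` and `n_frames` are assumptions for illustration, not taken from `trimap_.py`:

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_points, n_dims, n_frames = 5, 2, 3

# Stand-ins: Y plays the role of the resolved initial embedding (an array),
# Y_all collects one frame of coordinates per recorded iteration.
Y = np.random.default_rng(0).normal(size=(n_points, n_dims))
Y_all = np.zeros((n_points, n_dims, n_frames))

# Assigning None into a float array is coerced to nan (the init=None case).
Y_all[:, :, 0] = None
none_gives_nan = bool(np.isnan(Y_all[:, :, 0]).all())

# Assigning a string cannot be converted to float (the init="pca" case).
try:
    Y_all[:, :, 0] = "pca"
    string_raises = False
except ValueError:
    string_raises = True

# Storing the resolved array instead gives a usable first frame.
Y_all[:, :, 0] = Y
first_frame_ok = bool(np.array_equal(Y_all[:, :, 0], Y))

print(none_gives_nan, string_raises, first_frame_ok)
```

This only reproduces the assignment behaviour in isolation; it does not call trimap itself.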

Happy to provide a PR for this if needed.

eamid commented 2 years ago

Thank you for catching this! Happy to merge the fix if you could send a PR :)