lejon / TSne.jl

Julia port of L.J.P. van der Maaten and G.E. Hinton's t-SNE visualisation technique.

Add option for precomputed distance #14

Closed (juliohm closed this issue 6 years ago)

juliohm commented 7 years ago

For many users, the distance is not Euclidean. It usually comes from a complicated (expensive) procedure. Could you please add an option like metric="precomputed", as available in Python's sklearn TSNE? http://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html

improbable-22 commented 6 years ago

Looking at @alyst's code, I am starting to think there is a square missing somewhere.

Hbeta!() contains P[j] = exp(-beta * D[j]), where D ought to be the squared distance (eq. (1) of the t-SNE paper).
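
For reference, eq. (1) of the t-SNE paper can be written as below. The substitution beta_i = 1 / (2 sigma_i^2) is an assumption about how the beta in Hbeta!() relates to the paper's sigma_i; under it, the exponent is -beta_i times the squared distance, not the plain distance.

```latex
p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)}
        = \frac{\exp\left(-\beta_i \lVert x_i - x_j \rVert^2\right)}
               {\sum_{k \neq i} \exp\left(-\beta_i \lVert x_i - x_k \rVert^2\right)},
\qquad \beta_i = \frac{1}{2\sigma_i^2}
```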

But D is computed with dist=false here:

pairwisedist(X::AbstractMatrix, dist::Bool) = dist ? X : pairwisedist(X, Euclidean())
pairwisedist(X::AbstractMatrix, dist::SemiMetric) = pairwise(dist, X') # use Distances

Should this default to SqEuclidean()?

alyst commented 6 years ago

@improbable22 Good catch, thanks! I will fix it shortly.

I guess we need to keep Euclidean() as the default and square the elements of D once inside tsne(); otherwise user-specified distances would have the same problem.
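
A minimal sketch of that idea, using only Base Julia. The helper names euclidean_rows and conditional_row are hypothetical and are not part of TSne.jl; the point is only that distances stay unsquared at the user-facing level (whether computed or precomputed) and are squared once before the Gaussian kernel is applied.

```julia
# Hypothetical helpers for illustration; this is not the TSne.jl implementation.

# Plain (unsquared) Euclidean distances between the rows of X.
function euclidean_rows(X::AbstractMatrix)
    n = size(X, 1)
    D = zeros(n, n)
    for i in 1:n, j in (i + 1):n
        d = sqrt(sum(abs2, X[i, :] .- X[j, :]))
        D[i, j] = d
        D[j, i] = d
    end
    return D
end

# Conditional probabilities p_{j|i} for point i, given one row of SQUARED
# distances and the precision beta (assumed to mean 1 / (2 * sigma^2)).
function conditional_row(D2row::AbstractVector, i::Integer, beta::Real)
    P = exp.(-beta .* D2row)
    P[i] = 0.0                    # a point is not its own neighbour
    return P ./ sum(P)
end

X = [0.0 0.0; 3.0 4.0; 6.0 8.0]   # three collinear points, 5 units apart
D = euclidean_rows(X)             # a precomputed distance matrix could be used here instead
D2 = D .^ 2                       # square once, before the kernel sees the values
P1 = conditional_row(D2[1, :], 1, 0.01)
```

With this layout a user-supplied distance matrix would take the place of D and go through the same squaring step, so the kernel always receives squared distances.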