saehm / DruidJS

A JavaScript Library for Dimensionality Reduction

UMAP metrics? #25

Closed Fil closed 2 years ago

Fil commented 3 years ago

I was trying to use the Jaccard distance as a metric for UMAP, but it seems that it only knows euclidean_squared and "precomputed". Is this something you'd be willing to add?

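function jaccard(a, b) {
  // Assumes binary (0/1 or boolean) entries.
  // c counts positions set in both vectors, n counts positions set in either.
  let c = 0, n = 0;
  for (let i = 0, l = a.length; i < l; ++i) {
    c += a[i] && b[i];
    n += a[i] || b[i];
  }
  return 1 - c / n;
}

saehm commented 3 years ago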

If you compute the distance matrix yourself with the Jaccard metric, you can use it, but it will not work correctly, because the gradient of the (reduced?) Euclidean distance is hard-coded. To work properly, we would also need a function that computes the gradient of the metric. I hope I did not get something wrong.
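For readers who want to try the precomputed route described above, here is a minimal sketch of building a pairwise Jaccard distance matrix in plain JavaScript. The names jaccardDistance and pairwiseDistances are hypothetical helpers, not part of DruidJS, and returning 0 for two all-zero vectors is an assumption. How the resulting matrix is handed to UMAP with the "precomputed" metric is not shown, since the thread does not spell out that call.

```javascript
// Jaccard distance between two binary vectors.
// Any truthy entry (1, true, non-zero number) counts as "present".
function jaccardDistance(a, b) {
  let both = 0;   // positions present in both vectors
  let either = 0; // positions present in at least one vector
  for (let i = 0; i < a.length; ++i) {
    const x = Boolean(a[i]);
    const y = Boolean(b[i]);
    if (x && y) ++both;
    if (x || y) ++either;
  }
  // Assumption: two all-zero vectors are treated as identical (distance 0).
  return either === 0 ? 0 : 1 - both / either;
}

// Hypothetical helper: symmetric N x N distance matrix as an array of arrays.
function pairwiseDistances(rows, metric) {
  const n = rows.length;
  const D = Array.from({ length: n }, () => new Array(n).fill(0));
  for (let i = 0; i < n; ++i) {
    for (let j = i + 1; j < n; ++j) {
      D[i][j] = D[j][i] = metric(rows[i], rows[j]);
    }
  }
  return D;
}

// Made-up toy data: three binary rows.
const rows = [
  [1, 1, 0, 0],
  [1, 0, 1, 0],
  [0, 0, 1, 1],
];
const D = pairwiseDistances(rows, jaccardDistance);
console.log(D);
```

Only the upper triangle is computed and then mirrored, because the Jaccard distance is symmetric and the diagonal is zero.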

saehm commented 2 years ago

Oh... Sorry, I think I misunderstood your question. The line you pointed out is wrong; it is the distance between the embedded points during the optimization. I had to make these changes to support different metrics. It should work now with version 0.3.16. I also translated some metrics from the original UMAP implementation (jaccard, sokal_michener, hamming, and yule).
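As a rough illustration of the other binary metrics named above, the sketch below follows their usual definitions as I recall them from the Python UMAP package. It is not the DruidJS source; the function names are hypothetical, and returning 0 from the Yule sketch when either mismatch count is zero is an assumption taken from that recollection.

```javascript
// Hamming: fraction of positions where the two vectors differ.
function hammingSketch(a, b) {
  let diff = 0;
  for (let i = 0; i < a.length; ++i) {
    if (a[i] !== b[i]) ++diff;
  }
  return diff / a.length;
}

// Sokal-Michener: 2 * mismatches / (n + mismatches), on binarized entries.
function sokalMichenerSketch(a, b) {
  let diff = 0;
  for (let i = 0; i < a.length; ++i) {
    if (Boolean(a[i]) !== Boolean(b[i])) ++diff;
  }
  return (2 * diff) / (a.length + diff);
}

// Yule: 2 * tf * ft / (tt * ff + tf * ft), where tt, tf, ft, ff count the
// (true,true), (true,false), (false,true), (false,false) positions.
function yuleSketch(a, b) {
  let tt = 0, tf = 0, ft = 0;
  for (let i = 0; i < a.length; ++i) {
    const x = Boolean(a[i]);
    const y = Boolean(b[i]);
    if (x && y) ++tt;
    else if (x) ++tf;
    else if (y) ++ft;
  }
  const ff = a.length - tt - tf - ft;
  // Assumption: no mismatches of one kind means distance 0 (also avoids 0/0).
  if (tf === 0 || ft === 0) return 0;
  return (2 * tf * ft) / (tt * ff + tf * ft);
}

// Made-up example vectors.
const u = [1, 1, 0, 0];
const v = [1, 0, 1, 0];
console.log(hammingSketch(u, v), sokalMichenerSketch(u, v), yuleSketch(u, v));
```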

Fil commented 2 years ago

Yes, it seems to be working:

new druid["UMAP"](data, /* n_neighbors */ 15, /* local_connectivity */ 10, /* min_dist */ 0.1, /* d */ 2, /* metric */ druid.jaccard, /* seed */ 42)

thanks!