jlmelville / uwot

An R package implementing the UMAP dimensionality reduction method.
https://jlmelville.github.io/uwot/
GNU General Public License v3.0

UMAP trustworthiness and continuity? #122

Open davidechicco opened 5 months ago

davidechicco commented 5 months ago

Hi, thanks for developing and releasing the uwot R package. Is there a function that computes the trustworthiness and continuity of the UMAP results?

Best regards,

-- Davide Chicco

jlmelville commented 5 months ago

Unfortunately, there are no trustworthiness and continuity functions in uwot. I wrote some evaluation functions at https://github.com/jlmelville/quadra but I am not sure how valuable they are.
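For anyone who wants numbers without another package, the usual rank-based definitions of trustworthiness and continuity (due to Venna and Kaski) can be sketched in base R. This is only a minimal sketch, not the quadra implementation: the function names `rank_matrix`, `trustworthiness` and `continuity` are made up for illustration, distances are assumed to be Euclidean, and the full n x n distance matrix is built, so it only suits small datasets. The PCA projection is a stand-in embedding so the snippet runs without uwot.

```r
# ranks[i, j] is the rank of point j among the neighbors of point i
# (1 = nearest). The point itself is pushed to the last rank.
rank_matrix <- function(X) {
  d <- as.matrix(dist(X))
  diag(d) <- Inf
  t(apply(d, 1, rank, ties.method = "first"))
}

# Trustworthiness penalizes points that are among the k nearest neighbors
# in the embedding but not in the original space, weighted by how far
# down the original ranking they really are. Requires k < n / 2.
trustworthiness <- function(X_high, X_low, k) {
  n <- nrow(X_high)
  stopifnot(nrow(X_low) == n, k < n / 2)
  r_high <- rank_matrix(X_high)
  r_low <- rank_matrix(X_low)
  intruders <- r_low <= k & r_high > k
  penalty <- sum((r_high - k)[intruders])
  1 - 2 / (n * k * (2 * n - 3 * k - 1)) * penalty
}

# Continuity is the same measure with the two spaces swapped: it penalizes
# original-space neighbors that are missing from the embedding.
continuity <- function(X_high, X_low, k) {
  trustworthiness(X_low, X_high, k)
}

X <- as.matrix(iris[, 1:4])
emb <- prcomp(X)$x[, 1:2]  # stand-in embedding; substitute the umap() output
trustworthiness(X, emb, k = 15)
continuity(X, emb, k = 15)
```

Both values lie between 0 and 1, with 1 meaning the k-neighborhoods are perfectly preserved. To score a UMAP result, pass the matrix returned by `umap(X)` in place of `emb`.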

Personally, I find that evaluating nearest neighbor preservation at a handful of neighborhood sizes (e.g. 15, 50, 150) works well enough for me. If you are prepared to install https://github.com/jlmelville/rnndescent, then it's not too hard to evaluate the overlap at whatever value of n_neighbors you used, e.g.:

library(uwot)
library(rnndescent)

n_neighbors <- 10
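# umap ignores non-numeric columns of a data frame (here, Species);
# ret_nn = TRUE also returns the high-dimensional neighbors it used
res <- umap(iris, ret_nn = TRUE, n_neighbors = n_neighbors)
high_dim_nn <- res$nn$euclidean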
# brute_force_knn and neighbor_overlap both come from rnndescent
low_dim_nn <-
  brute_force_knn(res$embedding, k = n_neighbors, n_threads = 6)
neighbor_overlap(high_dim_nn, low_dim_nn)
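To look at a handful of neighborhood sizes rather than only the n_neighbors used for the embedding, the same two rnndescent functions can be looped over several values of k. The following is a rough sketch under some assumptions: the numeric iris columns are used directly, the values in `ks` are arbitrary choices kept small because iris has only 150 rows, and the high-dimensional neighbors are recomputed by brute force for each k instead of being taken from the umap result.

```r
library(uwot)
library(rnndescent)

X <- as.matrix(iris[, 1:4])
embedding <- umap(X, n_neighbors = 15)

# neighborhood sizes to evaluate (illustrative values, not a recommendation)
ks <- c(5, 15, 50)

overlaps <- sapply(ks, function(k) {
  # exact neighbors in the original space and in the embedding
  high_dim_nn <- brute_force_knn(X, k = k)
  low_dim_nn <- brute_force_knn(embedding, k = k)
  neighbor_overlap(high_dim_nn, low_dim_nn)
})
names(overlaps) <- ks
overlaps
```

Each entry is the overlap at that k, so a drop at larger k suggests that local structure is preserved better than structure at longer range.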