curiosity-ai / umap-sharp

C# library for fast embeddings projection using Uniform Manifold Approximation and Projection
MIT License

DensMAP #10

Open lofcz opened 10 months ago

lofcz commented 10 months ago

The library is great (as usual with your work). Are there plans to extend the implementation with DensMAP support to preserve local data density? This is critical if the reduced results are to be used for clustering: with UMAP, the same embedding (for example, one from ada-002) lands in a slightly different position in the reduced space on every run:

[image: numberOfNeighbors = 15, K-Means++ clustering]

This can be mitigated to an extent with hyperparameter tuning:

[image: numberOfNeighbors = 80, DBSCAN clustering]

DensMAP is an extension of UMAP and, if I'm not missing something, seems to be implemented in only two files: