lmcinnes / umap

Uniform Manifold Approximation and Projection
BSD 3-Clause "New" or "Revised" License

output umap clusters (label, features and cluster id) #1149

Open dgcovell opened 2 months ago

dgcovell commented 2 months ago

This is not really an 'issue' but a request for more information about UMAP results. I scanned past issues and found #1041 and #938, which are somewhat relevant but not exactly what I need. If possible, could code be provided, using any example except the dynamic cases, that generates UMAP clusters from the raw data? The output should include all the labels and features, as well as the latent space (coordinates??). This may already be in the existing examples, but I cannot find it. If it is, please point me to where.

Thanks. BTW, as you know, UMAP results represent strong competition for existing TensorFlow utilities.
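On the "cluster id" part of the request: UMAP itself returns coordinates (the embedding), not cluster assignments, so cluster ids have to come from a separate clustering step run on those coordinates. The following is a minimal sketch of that step, not the library's own method. It assumes scikit-learn is installed and uses DBSCAN as one possible clusterer. The `embedding` array is stand-in data so the snippet runs on its own; in practice it would be `mapper.embedding_`. The column names and the `eps` / `min_samples` values are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN

# Stand-in for mapper.embedding_: two well-separated 2-D blobs (assumption,
# used only so the sketch runs without fitting UMAP).
rng = np.random.default_rng(0)
embedding = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.1, size=(20, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.1, size=(20, 2)),
])

# Cluster the embedding coordinates; DBSCAN marks noise points with -1.
# eps and min_samples are illustrative and need tuning on real data.
cluster_ids = DBSCAN(eps=0.5, min_samples=5).fit_predict(embedding)

# One row per sample: the two coordinates plus the derived cluster id.
df = pd.DataFrame(embedding, columns=["x", "y"])
df["cluster_id"] = cluster_ids
print(df["cluster_id"].value_counts())
```

Any clusterer that accepts an (n_samples, n_dims) array could be swapped in here; the choice of DBSCAN is only for illustration.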

dgcovell commented 1 month ago

For example: umap.plot.points(mapper, labels=pendigits.target)

yields a scatter plot of the embedding. How do I associate these labels with mapper.embedding_ and export three columns to an Excel file? So far I can get only the two embedding columns.

df = pd.DataFrame(mapper.embedding_)
df.to_excel('mapper_embeddings.xlsx')
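One possible way to get the third column, offered as a sketch rather than an official recipe: the rows of `mapper.embedding_` are in the same order as the rows of the data passed to `fit`, so the label array can be added to the same DataFrame by position. The helper name `embedding_to_frame` and the column names are hypothetical, and the demo arrays are stand-ins for `mapper.embedding_` and `pendigits.target` so the snippet runs without fitting UMAP.

```python
import numpy as np
import pandas as pd


def embedding_to_frame(embedding, labels):
    """Build a DataFrame with one label column plus one column per embedding dimension.

    Hypothetical helper: assumes labels[i] belongs to embedding[i].
    """
    embedding = np.asarray(embedding)
    labels = np.asarray(labels)
    if embedding.shape[0] != labels.shape[0]:
        raise ValueError("embedding and labels must have the same number of rows")
    columns = [f"umap_{i + 1}" for i in range(embedding.shape[1])]
    df = pd.DataFrame(embedding, columns=columns)
    df.insert(0, "label", labels)  # put the label in the first column
    return df


# Stand-in data (assumption): replace with mapper.embedding_ and pendigits.target.
rng = np.random.default_rng(0)
demo_embedding = rng.normal(size=(5, 2))
demo_labels = np.array([0, 1, 2, 1, 0])

df = embedding_to_frame(demo_embedding, demo_labels)
print(df.columns.tolist())  # ['label', 'umap_1', 'umap_2']

# With a fitted mapper (writing .xlsx also needs an Excel engine such as openpyxl):
# df = embedding_to_frame(mapper.embedding_, pendigits.target)
# df.to_excel('mapper_embeddings.xlsx', index=False)
```

Passing `index=False` keeps the pandas row index out of the spreadsheet, so the file contains exactly the three columns.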