mehta-lab / VisCy

computer vision models for single-cell phenotyping
https://pypi.org/project/viscy/
BSD 3-Clause "New" or "Revised" License

Number of UMAP components #165

Open ziw-liu opened 2 weeks ago

ziw-liu commented 2 weeks ago
@ziw-liu, I have been using 2 UMAP components for plotting and 4 components when I compute correlations with computed features. Does this change the interpretation of the results?

_Originally posted by @Soorya19Pradeep in https://github.com/mehta-lab/VisCy/pull/153#discussion_r1755402256_

ziw-liu commented 2 weeks ago

UMAPs with 2, 3, and 4 components are being used in different places. My intuition is that we should use only 2 components for plotting, and compute all correlations in PC space for better interpretability.

ziw-liu commented 2 weeks ago

cc @alishbaimran @mattersoflight

mattersoflight commented 2 weeks ago

@ziw-liu @Soorya19Pradeep @alishbaimran I agree with Ziwen: use UMAP for plotting/visualization and PCA for computing correlations with engineered features. We should also do 3D UMAP renderings in napari, as in the zebrahub paper: https://github.com/royerlab/zebrahub-paper-umap-3d. So the number of UMAP components can be 2 or 3. When computing PCA, use the number of components that explains 95% of the variance of the embeddings.
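
For reference, a minimal sketch of the PCA side of this suggestion, using NumPy only. It keeps the fewest principal components whose cumulative explained variance reaches 95%, then computes Pearson correlations between those PCs and a feature table. The function names `pca_95` and `correlate_with_features` are hypothetical and not part of VisCy, and the embeddings and "engineered features" below are synthetic stand-ins, not real data. scikit-learn's `PCA(n_components=0.95)` performs the same component selection, if that dependency is preferred.

```python
import numpy as np


def pca_95(embeddings, variance_threshold=0.95):
    """Project onto the fewest PCs whose cumulative explained variance
    reaches `variance_threshold` (hypothetical helper, not VisCy API)."""
    centered = embeddings - embeddings.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal axes.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained_ratio = s**2 / np.sum(s**2)
    cumulative = np.cumsum(explained_ratio)
    # First index where the cumulative ratio reaches the threshold.
    n_components = int(np.searchsorted(cumulative, variance_threshold)) + 1
    n_components = min(n_components, len(s))
    projected = centered @ vt[:n_components].T
    return projected, explained_ratio[:n_components]


def correlate_with_features(pcs, features):
    """Pearson correlation matrix of shape (n_pcs, n_features)."""
    pcs_z = (pcs - pcs.mean(axis=0)) / pcs.std(axis=0)
    feats_z = (features - features.mean(axis=0)) / features.std(axis=0)
    return pcs_z.T @ feats_z / len(pcs)


# Synthetic stand-in data: 500 samples, 32-dimensional embeddings
# driven by 3 latent factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
embeddings = latent @ rng.normal(size=(3, 32)) + 0.05 * rng.normal(size=(500, 32))
features = latent[:, :2]  # placeholder for engineered features

pcs, ratio = pca_95(embeddings)
corr = correlate_with_features(pcs, features)
print(f"kept {pcs.shape[1]} PCs explaining {ratio.sum():.1%} of the variance")
print("PC-by-feature correlation matrix shape:", corr.shape)
```

Because PCs are linear combinations of the embedding dimensions, these correlations stay comparable regardless of how many UMAP components (2 or 3) are chosen for visualization.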