chanzuckerberg / cellxgene

An interactive explorer for single-cell transcriptomics data
https://chanzuckerberg.github.io/cellxgene/
MIT License
614 stars 116 forks

Genes as primary view (heatmap) #632

Open colinmegill opened 5 years ago

colinmegill commented 5 years ago

The overall idea is to allow toggling between the 'cells' view (umap/tsne) and a 'genes' view (probably a heatmap) so that cellxgene users can get a sense of the overall shape of the dataset from a genes perspective, see how the genes cluster, and drill down and select them.

A couple very rough mocks just to get the idea across:

image

image

Filing this issue so we can begin to reference it from other issues. Thanks to @ambrosejcarr for spurring this idea forward in conversations & mocks.

colinmegill commented 5 years ago

A note: this issue is a 'stub', and can be used for the more general epic 'exploring var data in cellxgene'

neuromusic commented 5 years ago

A few thoughts....

colinmegill commented 5 years ago

UI research work done by Stamen for Banfield lab:

Other sources considered in the design of this feature:

  1. https://hicexplorer.readthedocs.io/en/latest/content/example_usage.html
  2. http://chorogenome.ie-freiburg.mpg.de:5001/#browser/a
  3. https://hicexplorer.readthedocs.io/en/latest/content/tools/hicPlotMatrix.html#hicplotmatrix
  4. https://epilogos.altius.org/?application=viewer&sampleSet=vA&mode=single&genome=hg19&model=15&complexity=KL&group=all&chrLeft=chr1&chrRight=chrY&start=20&stop=59373546
  5. https://github.com/linnarsson-lab/loom-viewer
  6. http://loom.linnarssonlab.org/dataset/heatmap/osmFISH/osmFISH_SScortex_mouse_all_cells.loom/NrBEoXQGmAGHgEYq2kqi3IExagZjwBYI0R58oA7AVwBs6UMnt5FZ5gB2J1GADl6leKaG2FhICALSwAdB3ywAbAFZ2ATmyrsWjcv5dVKOfi6JE2IlyK3EO_vlwEoRKNiaISMTuj7pMGBw8QjI4JloGTxY2Dh8hePhUMQTxNEpIxkpgTOp6Ri8MDTyo3Nz4XIsCCvySxkqPbXccZTq2zJTMCCA
  7. http://celltypes.brain-map.org/rnaseq/mouse/cortex-and-hippocampus
  8. https://higlass.io/
  9. https://bl.ocks.org/mostaphaRoudsari/0e5518ec336f16a1559b
  10. https://bl.ocks.org/mostaphaRoudsari/0e5518ec336f16a1559b
  11. https://github.com/raivokolde/pheatmap
  12. http://circos.ca/intro/published_images/
  13. https://www.cytosplore.org/
  14. http://seaborn.pydata.org/generated/seaborn.clustermap.html
  15. https://twitter.com/officeofjane/status/1166601027036033024
  16. https://twitter.com/franschrandez/status/1225138722930339840
  17. https://twitter.com/torkelo/status/857201923392495617
  18. https://towardsdatascience.com/better-heatmaps-and-correlation-matrix-plots-in-python-41445d0f2bec
  19. https://jameshadfield.github.io/phandango/#/examples
  20. https://en.wikipedia.org/wiki/Heat_map#/media/File:Heatmap.png
  21. https://collection.cooperhewitt.org/objects/2318798835/
  22. http://higlass.io/app/?config=Q5LdNchQRLSZ_0yKsTEoiw
  23. https://www.researchgate.net/figure/Heatmap-of-the-top-26-features-that-characterize-non-malaria-and-malaria-febrile_fig12_297583050

colinmegill commented 4 years ago

If not gene * gene, but instead cell * gene (!), a track plot is a brilliant option, so long as we can scale it with the number of cells... See https://chanzuckerberg.github.io/scRNA-python-workshop/analysis/05-diffexp.html:

image

colinmegill commented 4 years ago

@ambrosejcarr also brings up that in all cases (except a perfectly linear distribution), it will be useful to have a control that lets users change the upper and lower limits of the colorscale to reveal substructure by color across all values. Doing this live would offer a very powerful interactive speedup to something scientists do a lot.

See 'adjusting contrast limits' here: https://napari.org/tutorials/image.html
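
A minimal sketch of what adjustable contrast limits could mean for a heatmap colorscale, assuming expression values are mapped linearly onto a 0 to 1 color position. The function name `apply_contrast_limits` and the sample values are hypothetical and are not part of cellxgene or napari.

```python
def apply_contrast_limits(values, lower, upper):
    """Rescale values to [0, 1], clipping anything outside [lower, upper]."""
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    span = upper - lower
    # Values at or below `lower` map to 0.0, values at or above `upper` map to 1.0.
    return [min(1.0, max(0.0, (v - lower) / span)) for v in values]


# Hypothetical expression values with one bright outlier.
expression = [0.0, 0.5, 1.0, 2.0, 8.0]

# With the full range, the low values are squeezed into the bottom of the scale.
full_range = apply_contrast_limits(expression, 0.0, 8.0)

# Lowering the upper limit spreads the low values across the whole colorscale.
narrowed = apply_contrast_limits(expression, 0.0, 2.0)

print(full_range)
print(narrowed)
```

Re-running only this rescaling step when a slider moves, rather than recomputing the underlying data, is what would make the control feel live.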

ambrosejcarr commented 4 years ago

See example of live contrast limits. Use your imagination and think of the bright spot and dark spot as clusters you want to see hidden substructure in. :-) contrast_limits

colinmegill commented 4 years ago

Clustergrammer has some nice patterns:

https://www.youtube.com/watch?v=eGDZA-xm_oc https://github.com/ismms-himc/visium-clustergrammer2clu

image

murraybrad13 commented 4 years ago

+1 For us as well

murraybrad13 commented 4 years ago

Heatmap display with cells as columns and genes as rows, or vice versa.

colinmegill commented 4 years ago

The more research I do on this feature, the more I realize how powerful it's going to be to hook it up to the crossfilter :)

murraybrad13 commented 4 years ago

+1

cornhundred commented 4 years ago

+1 for interactive heatmaps :) @colinmegill @murraybrad13 we made this example on Observable (using https://github.com/ismms-himc/clustergrammer-gl) where we allow users to run enrichment analysis on a selected subset of genes (via clicking on dendrogram or zooming and clicking heatmap):

https://observablehq.com/@ismms-himc/covid-19-transcriptional-signature-tenoever-data-a549?collection=@ismms-himc/ismms-himc-covid-19

This example shows how users can select a cluster of genes and start to dig into the functions of that gene set using prior knowledge.

colinmegill commented 4 years ago

Notes 1-7 image

Notes 8-10 image

colinmegill commented 4 years ago

A short list of design constraints (to avoid):

    • [ ] Stretched, larger than necessary grid cells
    • [ ] Grid cells as interactive targets for pointer or touch
    • [ ] Padding and whitespace between square grid cells (though this is inevitable with trackplots and dotplots if we allow that as a toggle)
    • [ ] Small, unreadable font sizes
    • [ ] Larger than necessary font sizes
    • [ ] Full height continuous colormap legend
    • [ ] Squished, dense, inscrutable dendrograms
    • [ ] Indeterminate, irreproducible interactions and resulting states (brushing and zooming frequently yield these)

ambrosejcarr commented 4 years ago

Cross-posting a story from Ben Humphreys from #96. I believe this is actually a heatmap use case:

Another idea - would be SUPER COOL if I could select two cell types that I know are adjacent to one another in the kidney, then click a button such that one cluster then shows all receptors it expresses, the other cluster all the ligands it expresses - so I could get at intercellular communication…that would be amazing…
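
One way to picture the selection logic in this story is the sketch below. It assumes a list of known ligand-receptor pairs and a set of expressed genes per cluster are already available. The gene names, the pair list, and `candidate_interactions` are all hypothetical placeholders, and a real tool would load a curated ligand-receptor database instead.

```python
# Hypothetical (ligand, receptor) pairs; placeholder names, not real genes.
LIGAND_RECEPTOR_PAIRS = [
    ("LIG_A", "REC_A"),
    ("LIG_B", "REC_B"),
    ("LIG_C", "REC_C"),
]


def candidate_interactions(sender_genes, receiver_genes, pairs):
    """Return pairs whose ligand is expressed by the sender cluster
    and whose receptor is expressed by the receiver cluster."""
    sender = set(sender_genes)
    receiver = set(receiver_genes)
    return [(lig, rec) for lig, rec in pairs if lig in sender and rec in receiver]


# Genes considered "expressed" in each of the two selected clusters (made up).
cluster_1 = {"LIG_A", "LIG_C", "GENE_X"}
cluster_2 = {"REC_A", "REC_B", "GENE_Y"}

hits = candidate_interactions(cluster_1, cluster_2, LIGAND_RECEPTOR_PAIRS)
print(hits)
```

The resulting ligand and receptor lists would then become the rows of the two heatmaps, one per selected cluster.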

cornhundred commented 4 years ago

Cross-posting a story from Ben Humphreys from #96. I believe this is actually a heatmap use case:

Another idea - would be SUPER COOL if I could select two cell types that I know are adjacent to one another in the kidney, then click a button such that one cluster then shows all receptors it expresses, the other cluster all the ligands it expresses - so I could get at intercellular communication…that would be amazing…

FYI @ambrosejcarr we did something similar with potential ligand-receptor interactions in Fernandez et al 2019 (https://www.nature.com/articles/s41591-019-0590-4) using our interactive heatmap (https://github.com/giannarelli-lab/Single-Cell-Immune-Profiling-of-Atherosclerotic-Plaques#30-ligand-receptor-sym-vs-asym-differential-regulation)

lig-rec_sym-asym_diff_interactions

link to nbviewer https://nbviewer.jupyter.org/github/giannarelli-lab/Single-Cell-Immune-Profiling-of-Atherosclerotic-Plaques/blob/master/notebooks/3.0_Ligand-Receptor_Sym-vs-Asym_Differential_Regulation.ipynb?flush_cache=true

ambrosejcarr commented 4 years ago

@cornhundred That looks right. Thanks for the example!

cornhundred commented 4 years ago

@ambrosejcarr no problem, happy to share ideas and potentially code :)

ktravaglini commented 4 years ago

@colinmegill following up from discussion yesterday (08/26/2020)

For both the DotPlot and Heatmap views, it would be helpful to group cells based on one piece of categorical metadata (e.g. tissue compartment) and then sort within each group based on another (e.g. where the cell is found along the proximal-to-distal axis in the lung). The example below is from Figure 5a in https://www.biorxiv.org/content/10.1101/742320v2.full.pdf. Displaying cell types and genes without expression is also helpful for keeping the figure organization the same across different gene sets. Also, in addition to organizing the genes based on expression similarity (with some sort of hierarchical clustering), it would be useful to allow organization of genes based on metadata (in the example below, hormone receptors are organized by the type of hormones they interact with). Gene metadata could be included in the supplied AnnData object (under the .var shelf) or uploaded with the gene list (as a TSV).
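
A minimal sketch of the "group by one field, then sort within by another" ordering, using plain Python records. The field names `compartment` and `axis_position`, the sample cells, and `order_cells` are hypothetical; in practice these values would come from the cell metadata of the loaded dataset.

```python
# Hypothetical per-cell metadata records.
cells = [
    {"id": "c1", "compartment": "epithelial", "axis_position": 0.8},
    {"id": "c2", "compartment": "stromal", "axis_position": 0.1},
    {"id": "c3", "compartment": "epithelial", "axis_position": 0.2},
    {"id": "c4", "compartment": "stromal", "axis_position": 0.6},
    {"id": "c5", "compartment": "endothelial", "axis_position": 0.5},
]


def order_cells(records, group_key, sort_key):
    """Order cells by group first, then by the secondary field within each group."""
    # Sorting on a (group, value) tuple keeps each group contiguous
    # and orders the cells inside it by the secondary field.
    return sorted(records, key=lambda c: (c[group_key], c[sort_key]))


ordered = order_cells(cells, "compartment", "axis_position")
ordered_ids = [c["id"] for c in ordered]
print(ordered_ids)
```

Here the groups come out in alphabetical order. A user-defined group order could be supported by mapping each group name to a rank before sorting.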

For DotPlots specifically, I think the default behavior should be for the mean expression to be calculated without 0s (to make it independent of the percentage of cells of each type in which the gene is detected). I could also imagine this being a toggle, though.
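
To make the difference concrete, here is a small sketch of the two quantities a dot encodes, with the zero-handling exposed as a toggle. The function `dot_stats`, its `exclude_zeros` parameter, and the counts are hypothetical illustrations, not an existing cellxgene API.

```python
def dot_stats(values, exclude_zeros=True):
    """Return (mean expression, fraction of cells expressing) for one gene
    in one cell type. With exclude_zeros=True the mean ignores cells with 0."""
    expressed = [v for v in values if v > 0]
    fraction = len(expressed) / len(values) if values else 0.0
    pool = expressed if exclude_zeros else values
    mean = sum(pool) / len(pool) if pool else 0.0
    return mean, fraction


# Hypothetical expression of one gene across eight cells of one type.
counts = [0, 0, 4, 2, 0, 6, 0, 0]

mean_without_zeros, fraction = dot_stats(counts, exclude_zeros=True)
mean_with_zeros, _ = dot_stats(counts, exclude_zeros=False)

print(mean_without_zeros, mean_with_zeros, fraction)
```

With zeros included, the mean drops whenever fewer cells express the gene, so dot color and dot size partly encode the same information. Excluding zeros keeps the two channels independent.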

Screen Shot 2020-08-27 at 2 43 16 PM

cornhundred commented 4 years ago

FYI more WebGL heatmap development by vitessce https://github.com/hubmapconsortium/vitessce/issues/656. I'm playing WebGL heatmap matchmaker

colinmegill commented 4 years ago

Dot plot legend:

https://github.com/satijalab/seurat/issues/2379

gokceneraslan commented 4 years ago

You can also check scanpy dotplot legend here:

https://scanpy-tutorials.readthedocs.io/en/latest/plotting/core.html#dotplot

and

https://scanpy.readthedocs.io/en/stable/_images/scanpy.pl.dotplot.png (two different styles are shown here)

and

https://scanpy-tutorials.readthedocs.io/en/latest/plotting/core.html#Visualize-marker-genes-using-dotplot

A few things related to the "what color represents" discussion we had:

1) mean_only_expressed argument (https://scanpy.readthedocs.io/en/stable/api/scanpy.pl.dotplot.html) determines how the mean is calculated (with or without zeros).

2) See the values_to_plot argument here: https://scanpy-tutorials.readthedocs.io/en/latest/plotting/core.html#Visualize-marker-genes-using-dotplot. It can take values like logfoldchanges etc. (Sorry, this argument is not well documented right now, due to a bug in the documentation tool I guess; it was supposed to be here: https://scanpy.readthedocs.io/en/latest/api/scanpy.pl.rank_genes_groups_dotplot.html)

signechambers1 commented 3 years ago

Clever example of an interactive heatmap with use of multiple rows of metadata to define columns and zooming and search functionality: https://lungmap.net/breath-omics-experiment-page/?experimentTypeId=LMXT0000000016&experimentId=LMEX0000004388&analysisId=LMAN0000000344&view=signatureList