Open colinmegill opened 5 years ago
A note: this issue is a 'stub', and can be used for the more general epic 'exploring var data in cellxgene'
A few thoughts....
UI research work done by Stamen for Banfield lab:
Other sources considered in the design of this feature:
If not `gene * gene`, but instead `cell * gene` (!), a track plot is a brilliant option, so long as we can scale it with the number of cells... https://chanzuckerberg.github.io/scRNA-python-workshop/analysis/05-diffexp.html:
@ambrosejcarr also brings up that in all cases (except perfectly linear distribution), it will be useful to have a control that allows users to change the upper and lower limits of the colorscale to reveal substructure by color across all values — doing this live would offer a very powerful interactive speedup to something scientists do a lot.
See 'adjusting contrast limits' here: https://napari.org/tutorials/image.html
See example of live contrast limits. Use your imagination and think of the bright spot and dark spot as clusters you want to see hidden substructure in. :-)
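To make the contrast-limit idea concrete, here is a minimal sketch of the underlying rescaling, assuming expression values are mapped to a colorscale position in [0, 1]. The function name `apply_contrast_limits` and the sample values are made up for illustration and are not part of cellxgene or napari.

```python
import numpy as np

def apply_contrast_limits(values, lower, upper):
    """Map values onto [0, 1] so that `lower` is 0 and `upper` is 1.

    Values outside the limits are clipped, which spends the whole
    colorscale on the range the user cares about.
    """
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    values = np.asarray(values, dtype=float)
    scaled = (values - lower) / (upper - lower)
    return np.clip(scaled, 0.0, 1.0)

# Illustrative expression values (not real data).
expression = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0])

# Full range: the low values are squeezed into the bottom of the scale.
full_range = apply_contrast_limits(expression, 0.0, 8.0)

# Tighter upper limit: low values now spread across the whole scale.
low_range = apply_contrast_limits(expression, 0.0, 2.0)
```

With the limits at 0 and 8, the four lowest values occupy only the bottom quarter of the colorscale; moving the upper limit to 2 spreads them across the full scale, at the cost of saturating everything above 2.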
Clustergrammer has some nice patterns:
https://www.youtube.com/watch?v=eGDZA-xm_oc https://github.com/ismms-himc/visium-clustergrammer2clu
+1 For us as well
Heatmap display with cells as columns and genes as rows, or vice versa.
The more research I do on this feature, the more I realize how powerful it's going to be to hook it up to the crossfilter :)
+1
+1 for interactive heatmaps :) @colinmegill @murraybrad13 we made this example on Observable (using https://github.com/ismms-himc/clustergrammer-gl) where we allow users to run enrichment analysis on a selected subset of genes (via clicking on dendrogram or zooming and clicking heatmap):
This example shows how users can select a cluster of genes and start to dig into the functions of that set of genes using prior knowledge.
Notes 1-7
Notes 8-10
A short list of design constraints (to avoid):
Cross-posting a story from Ben Humphreys from #96. I believe this is actually a heatmap use case:
Another idea - would be SUPER COOL if I could select two cell types that I know are adjacent to one another in the kidney, then click a button such that one cluster then shows all receptors it expresses, the other cluster all the ligands it expresses - so I could get at intercellular communication…that would be amazing…
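A rough sketch of what that button could compute, assuming a cells-by-genes expression matrix per selected cluster and a list of known ligand-receptor pairs. The gene names, the pair list, and the `min_fraction` threshold are all hypothetical placeholders; a real implementation would need a curated pair database.

```python
import numpy as np

# Hypothetical ligand-receptor pairs, for illustration only.
PAIRS = [("LIG1", "REC1"), ("LIG2", "REC2"), ("LIG3", "REC3")]

def expressed_genes(expr, gene_names, min_fraction=0.1):
    """Return genes detected (value > 0) in at least min_fraction of cells."""
    fraction = (np.asarray(expr) > 0).mean(axis=0)
    return {g for g, f in zip(gene_names, fraction) if f >= min_fraction}

def candidate_interactions(sender_expr, receiver_expr, gene_names, pairs,
                           min_fraction=0.1):
    """Pairs whose ligand is expressed in the sender cluster and whose
    receptor is expressed in the receiver cluster."""
    ligands = expressed_genes(sender_expr, gene_names, min_fraction)
    receptors = expressed_genes(receiver_expr, gene_names, min_fraction)
    return [(lig, rec) for lig, rec in pairs
            if lig in ligands and rec in receptors]

# Toy matrices (rows are cells, columns follow gene_names).
gene_names = ["LIG1", "LIG2", "REC1", "REC2"]
sender = np.array([[1, 0, 0, 0],
                   [2, 0, 0, 1]])
receiver = np.array([[0, 0, 3, 0],
                     [0, 1, 1, 0]])

hits = candidate_interactions(sender, receiver, gene_names, PAIRS)
```

Here only the pair whose ligand is detected in the first cluster and whose receptor is detected in the second survives the filter.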
FYI @ambrosejcarr we did something similar with potential ligand-receptor interactions in Fernandez et al 2019 (https://www.nature.com/articles/s41591-019-0590-4) using our interactive heatmap (https://github.com/giannarelli-lab/Single-Cell-Immune-Profiling-of-Atherosclerotic-Plaques#30-ligand-receptor-sym-vs-asym-differential-regulation)
@cornhundred That looks right. Thanks for the example!
@ambrosejcarr no problem, happy to share ideas and potentially code :)
@colinmegill following up from discussion yesterday (08/26/2020)
For both DotPlot and Heatmap view, it would be helpful to group cells based on 1 piece of categorical metadata (e.g. tissue compartment) and then sort within based on another (e.g. where the cell is found along the proximal to distal axis in the lung). Example below is from Figure 5a in https://www.biorxiv.org/content/10.1101/742320v2.full.pdf. Displaying cell types and genes without expression is also helpful for keeping the figure organization the same across different gene sets. Also, in addition to organizing the genes based on expression similarity (with some sort of hierarchical clustering), it would be useful to allow organization of genes based on metadata (in the example below, hormone receptors are organized by the type of hormones they interact with). Gene metadata could be included in the supplied AnnData object (under the .var shelf) or uploaded with the gene list (as a TSV).
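As a minimal sketch of the "group by one field, sort within by another" ordering, assuming the cell metadata is available as a pandas DataFrame like AnnData's `.obs`. The column names `compartment` and `axis_position` and the cell labels are made up for illustration.

```python
import pandas as pd

# Hypothetical per-cell metadata, shaped like an AnnData .obs table.
obs = pd.DataFrame(
    {
        "compartment": ["epithelial", "immune", "epithelial", "immune", "epithelial"],
        "axis_position": [3, 1, 1, 2, 2],  # e.g. proximal (1) to distal (3)
    },
    index=["cell_a", "cell_b", "cell_c", "cell_d", "cell_e"],
)

def order_cells(obs, group_by, sort_within):
    """Return cell labels grouped by one column and sorted by another inside each group."""
    ordered = obs.sort_values([group_by, sort_within])
    return list(ordered.index)

cell_order = order_cells(obs, "compartment", "axis_position")
```

The same function could order genes by metadata if it were given a `.var`-like table instead. If the grouping column is a pandas Categorical, the sort follows the category order rather than alphabetical order, which would let a user control the order of the groups.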
For DotPlots specifically, I think the default behavior should be for the mean expression to be calculated without 0s (to make it independent of the percentage of cells of each type it's detected in). I could also imagine this being a toggle though.
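To illustrate why excluding zeros decouples the two dot plot encodings, here is a small sketch that computes both the fraction of expressing cells and the mean under either convention. The function `dotplot_stats` and the toy matrix are illustrative assumptions, not scanpy's or cellxgene's implementation.

```python
import numpy as np

def dotplot_stats(expr, only_expressed=True):
    """Per-gene fraction of expressing cells and mean expression.

    expr is a cells-by-genes matrix for one group of cells. With
    only_expressed=True the mean is taken over non-zero cells only.
    """
    expr = np.asarray(expr, dtype=float)
    expressed = expr > 0
    n_expressed = expressed.sum(axis=0)
    fraction = n_expressed / expr.shape[0]
    if only_expressed:
        totals = np.where(expressed, expr, 0.0).sum(axis=0)
        # Genes with no expressing cells get a mean of 0 instead of NaN.
        mean = np.divide(totals, n_expressed,
                         out=np.zeros_like(totals), where=n_expressed > 0)
    else:
        mean = expr.mean(axis=0)
    return fraction, mean

# Toy group: 4 cells, 2 genes (illustrative values).
group = np.array([[0.0, 2.0],
                  [0.0, 4.0],
                  [3.0, 0.0],
                  [0.0, 0.0]])

frac, mean_expressed = dotplot_stats(group, only_expressed=True)
_, mean_all = dotplot_stats(group, only_expressed=False)
```

In this toy group both genes have the same mean among expressing cells, so dot color would match and only dot size (fraction) would differ. With zeros included, the means differ in proportion to the fraction, so color and size partly encode the same thing.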
FYI more WebGL heatmap development by vitessce https://github.com/hubmapconsortium/vitessce/issues/656. I'm playing WebGL heatmap matchmaker
Dot plot legend:
You can also check scanpy dotplot legend here:
https://scanpy-tutorials.readthedocs.io/en/latest/plotting/core.html#dotplot
and
https://scanpy.readthedocs.io/en/stable/_images/scanpy.pl.dotplot.png (two different styles are shown here)
and
A few things related to the "what color represents" discussion we had:
1) The `mean_only_expressed` argument (https://scanpy.readthedocs.io/en/stable/api/scanpy.pl.dotplot.html) determines how the mean is calculated (with or without zeros).
2) See the `values_to_plot` argument here: https://scanpy-tutorials.readthedocs.io/en/latest/plotting/core.html#Visualize-marker-genes-using-dotplot, which can take values like logfoldchanges etc. (Sorry, this argument is not well documented right now, I guess due to a bug in the documentation tool; it was supposed to be here: https://scanpy.readthedocs.io/en/latest/api/scanpy.pl.rank_genes_groups_dotplot.html.)
Clever example of an interactive heatmap with use of multiple rows of metadata to define columns and zooming and search functionality: https://lungmap.net/breath-omics-experiment-page/?experimentTypeId=LMXT0000000016&experimentId=LMEX0000004388&analysisId=LMAN0000000344&view=signatureList
The overall idea is to allow toggling between the 'cells' view (umap/tsne) and a 'genes' view (probably a heatmap) so that cellxgene users can get a sense of the overall shape of the dataset from a genes perspective, see how the genes cluster, and drill down and select them.
A couple very rough mocks just to get the idea across:
Filing this issue so we can begin to reference it from other issues. Thanks to @ambrosejcarr for spurring this idea forward in conversations & mocks.