maximilianh / cellBrowser

main repo: https://github.com/ucscGenomeBrowser/cellBrowser/ - Python pipeline and Javascript scatter plot library for single-cell datasets, http://cellbrowser.rtfd.org
GNU General Public License v3.0

display gene table and heatmap #26

Open slowkow opened 5 years ago

slowkow commented 5 years ago

Right now the table of genes with p-values is only visible after clicking on the name of a cluster.

This means you have to choose which display you want to see, either the map of cells or the table of genes, but not both at the same time.

You might consider an alternative display, as in the Loupe software by 10X Genomics.

Here are two screenshots that show how they do it:

[image: Loupe screenshot 1]

[image: Loupe screenshot 2]

I like the interactive table.

What do you think?

maximilianh commented 5 years ago

This is an excellent idea. I use Loupe, but for some reason it never occurred to me to copy this layout. I can't promise that I'll do it right away, but the next time I'm working on the UI, I'll certainly implement this. Thanks!

maximilianh commented 5 years ago

Still haven't gotten to this. Other users have requested violin plots of selected marker genes on the clusters rather than a heatmap. Opinions?

slowkow commented 5 years ago

A violin is better than nothing, but I have opinions after seeing many figures.

Violins are not appropriate for sparse data: most cells have 0 expression for most genes. The violin shows a weird blob at the bottom, which creates a false impression of low-level expression.

For this reason, I like to add a bar that shows what fraction of cells are nonzero, which gives the information the violin leaves out.

I use geom_quasirandom() from the ggbeeswarm package (https://github.com/eclarke/ggbeeswarm) to show only the non-zero values, leaving out all of the zeros, which are already summarized by the bar.
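The idea of splitting zeros from non-zero values can be sketched in a few lines of plain Python. This is only an illustration, not code from cellBrowser or ggbeeswarm: the function name `summarize_nonzero`, the input layout (one expression value and one cluster label per cell) and the toy numbers are all assumptions made up for the example.

```python
from collections import defaultdict


def summarize_nonzero(expression, clusters):
    """For one gene, return per cluster the fraction of cells with
    non-zero expression (the "bar") and the non-zero values themselves
    (the points that would be drawn in the swarm)."""
    if len(expression) != len(clusters):
        raise ValueError("need exactly one cluster label per cell")

    totals = defaultdict(int)    # number of cells per cluster
    nonzero = defaultdict(list)  # non-zero expression values per cluster
    for value, cluster in zip(expression, clusters):
        totals[cluster] += 1
        if value > 0:
            nonzero[cluster].append(value)

    return {
        cluster: {
            "fraction_nonzero": len(nonzero[cluster]) / totals[cluster],
            "nonzero_values": nonzero[cluster],
        }
        for cluster in totals
    }


# Hypothetical toy data: 8 cells in 2 clusters, mostly zeros.
expression = [0.0, 0.0, 2.5, 0.0, 1.0, 3.0, 0.0, 0.0]
clusters = ["A", "A", "A", "A", "B", "B", "B", "B"]
summary = summarize_nonzero(expression, clusters)
print(summary["A"]["fraction_nonzero"], summary["A"]["nonzero_values"])
```

The fraction drives the bar and the list of values is what gets plotted, so the zeros never pile up into a blob at the bottom of the plot.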

A heatmap where each row is a gene and each column is a cell is a valuable way to see subclusters that are not represented by a 2D map (tSNE, UMAP, PCA, etc.). It's also valuable for seeing which genes are correlated with your query gene.

maximilianh commented 5 years ago

Thanks, very nice idea to remove the 0s!

Oops... every column is a cell? That won't be easy: most of the more recent 10x datasets have >30k cells now.

I was thinking of doing a clusters x genes heatmap.

slowkow commented 5 years ago

You're right that showing cells only works for small data. A heatmap of clusters (columns) and genes (rows) would be great, too.
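As a rough sketch of what the values behind a clusters x genes heatmap could look like, the snippet below averages each gene over the cells of each cluster. It is not taken from cellBrowser: the function name `cluster_gene_means`, the genes-by-cells list layout and the toy numbers are assumptions for illustration, and the mean is just one possible summary.

```python
def cluster_gene_means(matrix, clusters):
    """matrix: one row per gene, one column per cell.
    clusters: one cluster label per cell (same order as the columns).
    Returns (cluster_names, means), where means[i][j] is the mean
    expression of gene i over the cells in cluster_names[j]."""
    if any(len(row) != len(clusters) for row in matrix):
        raise ValueError("every gene row needs one value per cell")

    names = sorted(set(clusters))
    # Column indices of the cells that belong to each cluster.
    columns = {
        name: [i for i, c in enumerate(clusters) if c == name]
        for name in names
    }
    means = [
        [sum(row[i] for i in columns[name]) / len(columns[name])
         for name in names]
        for row in matrix
    ]
    return names, means


# Hypothetical toy data: 2 genes (rows) x 4 cells (columns).
matrix = [
    [1.0, 3.0, 0.0, 0.0],  # geneA
    [0.0, 0.0, 2.0, 4.0],  # geneB
]
clusters = ["T", "T", "B", "B"]
names, means = cluster_gene_means(matrix, clusters)
print(names)
print(means)
```

The result has one column per cluster instead of one per cell, so its width stays small even for datasets with more than 30k cells.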