maximilianh / cellBrowser

main repo: https://github.com/ucscGenomeBrowser/cellBrowser/ - Python pipeline and Javascript scatter plot library for single-cell datasets, http://cellbrowser.rtfd.org
GNU General Public License v3.0

display expression values in 2d bins #162

Open slowkow opened 4 years ago

slowkow commented 4 years ago

I'd like to ask if it would be possible to change the way the data is displayed in the main window.

As far as I know, there are no scRNA-seq data browsers that use 2d bins to show expression data. I think it might be worth a try.

This article from the documentation for the datashader python package does a great job showing why plotting colored dots is not optimal for large datasets.

https://datashader.org/user_guide/Plotting_Pitfalls.html

Visualization is supposed to help you explore and understand your data, but if your visualizations are systematically misrepresenting your data because of overplotting, oversaturation, undersampling, undersaturation, underutilized range, and nonuniform colormapping, then you won't be able to discover the real qualities of your data and will be unable to make the right decisions.

I agree with the issues raised in the article, and in my own experience I've found that it's easier to see expression patterns when using 2d bins with the mean of all cells shown in each bin.
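To make the idea concrete, here is a minimal sketch of "mean of all cells in each bin" using square bins and only the Python standard library. The function name `mean_per_square_bin`, the `bin_size` parameter, and the toy data are made up for illustration; none of this is Cell Browser code.

```python
import math
from collections import defaultdict

def mean_per_square_bin(xs, ys, values, bin_size):
    """Return {(col, row): mean value} for square bins of side bin_size.

    Hypothetical helper for illustration only.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, v in zip(xs, ys, values):
        # Integer bin index of the square that contains this cell.
        key = (math.floor(x / bin_size), math.floor(y / bin_size))
        sums[key] += v
        counts[key] += 1
    # Only bins that contain at least one cell appear in the result.
    return {key: sums[key] / counts[key] for key in sums}

# Toy data: three cells fall in bin (0, 0), one cell in bin (1, 0).
xs = [0.1, 0.4, 0.9, 1.5]
ys = [0.2, 0.3, 0.8, 0.5]
expr = [0.0, 2.0, 4.0, 5.0]
means = mean_per_square_bin(xs, ys, expr, bin_size=1.0)
```

Each bin gets one color based on its mean, so a crowded region does not depend on which dots happen to be drawn on top.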

Here is the summary of the strategy recommended in the article:

These examples show some of the issues with various strategies for representing large numbers of dots in two dimensions:

image

An example of a figure that uses the strategy recommended in the article looks like this:

image

I might try to work on this, and I'll try to share if I make progress.

My first idea is to try the density heatmap plot from vega, but there might be other approaches worth trying, too.

image

slowkow commented 4 years ago

At this point I've hacked together my own javascript that builds on top of your foundation. I'm thoroughly impressed with your code.

Here are my attempts with d3 and vega. Of course I'd be happy to share the code. Right now I'm still playing around; eventually I'll make a blog post or something.

CellBrowser

Here's a gene in CellBrowser:

image

d3

I got a prototype working with d3-hexbin, which shows the mean of the log2CPM expression value for cells in each bin:

image
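For readers who want to see what hexagonal binning with a per-bin mean involves, here is a rough sketch in Python. It is not the d3-hexbin implementation and not the code used for the screenshot above. It relies on the fact that a pointy-top hexagonal grid is the union of two offset rectangular lattices, so the hexagon containing a point is the one with the nearest center. The names `hex_center`, `hexbin_mean`, and `radius` are assumptions for this example.

```python
import math
from collections import defaultdict

def hex_center(x, y, radius):
    """Return the center of the pointy-top hexagon that contains (x, y)."""
    w = math.sqrt(3) * radius   # horizontal distance between neighboring centers
    h = 3 * radius              # vertical period of each rectangular lattice
    # Nearest center on lattice A: (i * w, j * h)
    ax, ay = round(x / w) * w, round(y / h) * h
    # Nearest center on lattice B: ((i + 0.5) * w, (j + 0.5) * h)
    bx = (math.floor(x / w) + 0.5) * w
    by = (math.floor(y / h) + 0.5) * h
    # The closer of the two candidates is the containing hexagon.
    if (x - ax) ** 2 + (y - ay) ** 2 <= (x - bx) ** 2 + (y - by) ** 2:
        return (ax, ay)
    return (bx, by)

def hexbin_mean(xs, ys, values, radius):
    """Return {hexagon center: mean of the values of the cells inside it}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, v in zip(xs, ys, values):
        center = hex_center(x, y, radius)
        sums[center] += v
        counts[center] += 1
    return {center: sums[center] / counts[center] for center in sums}

# Toy data: two cells near the origin, one cell in the neighboring hexagon.
hex_means = hexbin_mean([0.0, 0.1, 0.9], [0.0, 0.0, 1.4], [1.0, 3.0, 10.0], radius=1.0)
```

The values passed in would be the log2CPM expression of one gene, with one entry per cell.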

vega

I got a rough prototype working with the vega heatmap. However, I don't know how to show the mean of log2CPM values. I posted a new question on Stackoverflow, and I hope someone might be able to suggest a workaround.

So instead, this is actually showing the density of points in each square, weighted by the quantized expression values.

image
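A small sketch of why a weighted density is not the same thing as a mean. It ignores the kernel smoothing that a density heatmap applies and simply treats the weighted density of a bin as the sum of the quantized levels of its cells. The function name `bin_summaries` and the numbers are invented for illustration.

```python
def bin_summaries(levels):
    """levels: quantized expression level of each cell in ONE bin.

    Returns (weighted_density, mean), where the weighted density is
    simplified to a plain sum of weights (no kernel smoothing).
    """
    weighted_density = sum(levels)
    mean = weighted_density / len(levels)
    return weighted_density, mean

crowded = bin_summaries([1] * 10)   # ten cells, each with low expression
sparse = bin_summaries([10])        # one cell with high expression
```

Both bins have the same weighted density (10), but their means differ (1.0 versus 10.0), so the weighted plot mixes "many cells" with "high expression". For raw per-bin sums, dividing the weighted sum by the unweighted cell count recovers the mean.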

matthewspeir commented 4 years ago

Hey, @slowkow!

This is really cool and it seems like it would be a really great feature. We're currently focusing on bringing in new datasets, so I'm not sure we have any time to dedicate to this. If you have an idea for how to implement this on the python backend, we can certainly draw hexagons in Javascript. Basically, if you can implement this at least partially and then need help integrating it with the rest of our code, @maximilianh said he would be happy to help.

Thanks!

maximilianh commented 4 years ago

Hi @slowkow, just adding that D3 was too slow for this, at least in my hands, when I tried to draw many circles. The calculation of the x,y coordinates would probably be too slow for Javascript and bigger datasets, I assume? So a python implementation may have advantages: the coords would be precalculated and then the javascript would only have to draw the hexagons. Hope this makes sense, let me know if we can help with something.
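One way to read the precalculation idea, as a sketch under assumptions: the bin each cell belongs to depends only on the layout coordinates, not on the gene, so it can be computed once on the Python side. Per gene, only a cheap aggregation remains, and the client just draws one shape per bin. Square bins are used here for brevity; the function names and the JSON layout are hypothetical and not part of the Cell Browser pipeline.

```python
import json
import math

def assign_bins(coords, bin_size):
    """Compute once per layout: bin index of every cell, plus bin centers."""
    index_of = {}       # (col, row) -> position in the centers list
    bin_of_cell = []
    centers = []
    for x, y in coords:
        key = (math.floor(x / bin_size), math.floor(y / bin_size))
        if key not in index_of:
            index_of[key] = len(centers)
            centers.append(((key[0] + 0.5) * bin_size, (key[1] + 0.5) * bin_size))
        bin_of_cell.append(index_of[key])
    return bin_of_cell, centers

def bin_means(bin_of_cell, n_bins, expr):
    """Compute per gene: mean expression in each precalculated bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for b, v in zip(bin_of_cell, expr):
        sums[b] += v
        counts[b] += 1
    # Every bin was created by at least one cell, so counts are never zero.
    return [s / c for s, c in zip(sums, counts)]

coords = [(0.2, 0.2), (0.8, 0.6), (1.5, 0.5)]
gene = [1.0, 3.0, 4.0]
bin_of_cell, centers = assign_bins(coords, bin_size=1.0)
means = bin_means(bin_of_cell, len(centers), gene)
# What a client would need in order to draw: one center and one value per bin.
payload = json.dumps({"centers": centers, "means": means})
```

The same split would apply to hexagons: only `assign_bins` changes, while the per-gene aggregation and the payload shape stay the same.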

(In reply to Matt Speir's comment of Mon, Apr 6, 2020 at 9:30 PM, shown above.)

slowkow commented 4 years ago

Here's what I have been hacking on in the past few weeks. It seems to work pretty well. It is fun to build on top of the foundation that you built with the binary files and range requests 😀

Data from Smillie et al 2019

ezgif com-video-to-gif (2)
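For context on "binary files and range requests", here is a generic sketch of the technique: if each gene's values sit at a known byte offset in one large file, a client can fetch just that slice with an HTTP `Range` header instead of downloading everything. The layout below (genes stored back to back as little-endian float32) is an assumption made up for this example and is not Cell Browser's actual file format.

```python
import struct

def range_header(offset, length):
    """HTTP Range header for `length` bytes starting at byte `offset`."""
    # Byte ranges are inclusive on both ends.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}

def decode_float32(payload):
    """Decode little-endian float32 values from raw bytes."""
    n = len(payload) // 4
    return list(struct.unpack(f"<{n}f", payload))

# Simulated file: 3 genes x 4 cells, stored gene after gene (hypothetical layout).
n_cells = 4
matrix = [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0], [8.0, 9.0, 10.0, 11.0]]
blob = b"".join(struct.pack(f"<{n_cells}f", *row) for row in matrix)

gene_index = 1
offset = gene_index * n_cells * 4
length = n_cells * 4
headers = range_header(offset, length)
# Slicing the local bytes stands in for the body of the ranged response.
values = decode_float32(blob[offset:offset + length])
```

Here `headers` asks for bytes 16 to 31, which is exactly the second gene's row.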

maximilianh commented 4 years ago

Hi Kamil, looks great. Do you have a link to this? Curious how you implemented it...

(In reply to Kamil Slowikowski's comment of Mon, Apr 13, 2020 at 9:11 PM, shown above. Links from the quoted email: Smillie et al 2019, http://doi.org/dqdf; animation, https://user-images.githubusercontent.com/209714/79151556-a67e0880-7d98-11ea-93f3-baa939c6dc86.gif)

slowkow commented 4 years ago

Hey Max, I apologize for the very long delay in response. Sometimes I forget to reply. Also, I was trying to decide for a long time whether I should contribute to your repo or create my own.

In the end, I made my own at https://github.com/slowkow/cellguide

I copied some of your files and then hacked new features until I had something that meets some of my needs. Of course, please feel free to copy anything; I kept the GPL-3 license. I like that I have the freedom to diverge in a different direction with my own repo.

I have more ideas for the future if you want to chat again sometime. Just let me know.