scverse / squidpy

Spatial Single Cell Analysis in Python
https://squidpy.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License
440 stars 79 forks source link

dotplot visualization for ligrec #163

Closed giovp closed 3 years ago

giovp commented 3 years ago

similar tohow cellphonedb does it

image

AnnaChristina commented 3 years ago

@giovp example for ligrec output is in here correct?

https://github.com/theislab/squidpy/blob/master/notebooks/lig_rec_cellphonedb_comparison.ipynb

working on this today and want to make sure this works with the implemented output of the ligrec permutation test in squidpy.

giovp commented 3 years ago

I would use the function perm_test as in the notebook and recreate an output to be sure 100%. Let me know if there is any issue I'll reply asap

AnnaChristina commented 3 years ago

I just merged master into my scanpy plotting branch and try to execute the perm_test to generate an example for the DotPlot.

But I get the following error:

File "/Users/anna.schaar/phd/code/spatial-tools/spatial_tools/__init__.py", line 9, in <module>
    import spatial_tools.graph as graph

  File "/Users/anna.schaar/phd/code/spatial-tools/spatial_tools/graph/__init__.py", line 5, in <module>
    from .perm_test import perm_test

  File "/Users/anna.schaar/phd/code/spatial-tools/spatial_tools/graph/perm_test.py", line 2
    from __future__ import annotations
                                      ^
SyntaxError: future feature annotations is not defined

I only have requirements installed in my local environment and the cellphonedb repo to run the perm_test notebook. have you encountered this already?

giovp commented 3 years ago

ah now it's called squidpy! I would re clone the new repo and work from there. Lt me know if it works!

AnnaChristina commented 3 years ago

Ok will try this!

michalk8 commented 3 years ago

SyntaxError: future feature annotations is not defined

This relates to Python3.6 (i.e. it doesn't work there because of the type annotations like np.ndarray[np.uint64]). Question is: shall we still support 3.6? If yes, I will make this work.

AnnaChristina commented 3 years ago

yes realized this and switched to other python version. was mainly on the old one due to this with celphoneDB https://github.com/Teichlab/cellphonedb/issues/225

but just did clone the repo @michalk8 mentioned, so this should not affect anything we do.

giovp commented 3 years ago

SyntaxError: future feature annotations is not defined

This relates to Python3.6 (i.e. it doesn't work there because of the type annotations like np.ndarray[np.uint64]). Question is: shall we still support 3.6? If yes, I will make this work.

we can drop 3.6 I think

AnnaChristina commented 3 years ago

@giovp this is the current status regarding this.

Bildschirmfoto 2020-12-01 um 11 00 26

two questions:

currently I fill NaN pvalues with 0 because otherwise DotPlot cannot handle the dot size.

Additionally, the dot size legend is super limited in scanpy. It requires dot min and dot max to be between 0 and 1

I tried to implement this similar to the cellphonedb, so we would clip -log10 pvalues above 3. but this then only shows me the dotsize for 3 in the legend.

mean is currently not transformed as this would result in -inf values so not sure if this is what we should do. However means don't really show a different for the dot right now

AnnaChristina commented 3 years ago

the plot is now a example for cluster_1 '10GMP' on the x-axis and source-target on y

giovp commented 3 years ago

@giovp this is the current status regarding this.

Bildschirmfoto 2020-12-01 um 11 00 26

It looks already a great starting point !

two questions:

  • should we also -log10 transform pvalues?

yes, would be best

  • should we also log2 transform means?

this I',m not sure, I don't think a negative mean expression makes sense tbh, never understood why they have it in cellphone db

currently I fill NaN pvalues with 0 because otherwise DotPlot cannot handle the dot size.

makes sense

Additionally, the dot size legend is super limited in scanpy. It requires dot min and dot max to be between 0 and 1

can you point me to where this is? I could send a PR in scanpy to accommodate this if needed.

I tried to implement this similar to the cellphonedb, so we would clip -log10 pvalues above 3. but this then only shows me the dotsize for 3 in the legend.

this legend is indeed weird, not sure really

mean is currently not transformed as this would result in -inf values so not sure if this is what we should do. However means don't really show a different for the dot right now

Also a bit weird. Can you open a PR with the code you have now and a reproducible example so I could also have a look? For the mean(source, target) for instance, what's the density looks like ?

giovp commented 3 years ago

for PR, I would suggest you to branch from current master, and merge #171 right away into the branch, cause it's gonna have major changes and might save you resolving too many conflicts later

AnnaChristina commented 3 years ago

Additionally, the dot size legend is super limited in scanpy. It requires dot min and dot max to be between 0 and 1

can you point me to where this is? I could send a PR in scanpy to accommodate this if needed.

https://github.com/theislab/scanpy/blob/5533b644e796379fd146bf8e659fd49f92f718cd/scanpy/plotting/_dotplot.py#L667

I tried to implement this similar to the cellphonedb, so we would clip -log10 pvalues above 3. but this then only >>shows me the dotsize for 3 in the legend.

this legend is indeed weird, not sure really

this is again due to this dot size requirement between 0 and 1, see here: https://github.com/theislab/scanpy/blob/5533b644e796379fd146bf8e659fd49f92f718cd/scanpy/plotting/_dotplot.py#L412

I did a hacky way for making this work so it shows an integer here which matches the value for -log10 transformed pvalues but due to this step range it will only show now 3. I thin this would also be resolved if scanpy allows more flexibility wrt dot min and dot max.

AnnaChristina commented 3 years ago

yes branch is from master and I am currently only developing this in my notebook to ensure no other changes affect anything here :+1:

will open the PR in a second

giovp commented 3 years ago

@AnnaChristina just talked fo fidel who wrote this, he said there shouldn't be a problem to change that. But are you sure you are looking at the dotplot class https://github.com/theislab/scanpy/blob/master/scanpy/plotting/_dotplot.py#L24-L778 that allows more flexibility than the function I think

giovp commented 3 years ago

also, I think I know why the mean expression did not change in plotting, has to do with the range. I feel by adding log2 it could show improved dynamic range

AnnaChristina commented 3 years ago

I think there are two ways we can implement this, either just calling the DotPlot class of scanpy (which @giovp mentions above) or write a class that inherits from DotPlot and changes relevant attributes/functions.

However this dot min/max issue arises at the plotting function itself: https://github.com/theislab/scanpy/blob/5533b644e796379fd146bf8e659fd49f92f718cd/scanpy/plotting/_dotplot.py#L667-L673

changing the legend should be fine if I just overwrite in our new class the _plot_size_legend function with something that can handle our requirements https://github.com/theislab/scanpy/blob/5533b644e796379fd146bf8e659fd49f92f718cd/scanpy/plotting/_dotplot.py#L408

ideally I would like something that can handle any dot max values and that I can set my steps in the size legend myself. the latter would work, but the first would currently require that I ovewrite the DotPlot._dotplot function which makes it completely redundant to use scanpy.DotPlot in the first place

but looking forward to fidels input regarding this!

giovp commented 3 years ago

@fidelram here's the issue that we were discussing before on slack. @AnnaChristina has been leading this, in the notebook (at the end) you can find an implementation of the dotplot that we'd like to wrap. I think really the obstacle now is the dot_max and dot_min setting, do you have any pointer on how to go about this? thank you in advance for the help!

fidelram commented 3 years ago

my recommendation is to write a class that inherits DotPlot as this is what I have done myself for other plots. The dot min/max should not be a problem as this is a number from 0 to 1 that can be adjusted to any case or simply not used

When DotPlot class is inherited two dataframes need to be created: one with the color values and other with the size scaled from 0 to 1 (as usually this is a fraction). The legend can be easily replaced by writing a different legend method.

The DotPlot class makes a grid where additional information can be added besides the dotplot itself and facilitates swapping the axes.

If only the dot plot is needed without any extra functionality, the _dotplot function is maybe simpler to use. I left that function as static precisely for situations that do not require all the extra functionality provided by the class.

I will be happy to assist if needed. Just maybe ping me on slack as I already receive too much info from git.