scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Common plotting library for the Scanpy ecosystem #1832

Open grst opened 3 years ago

grst commented 3 years ago

I was wondering if plotting could be facilitated and made more consistent across the Scanpy ecosystem. I envisage a library ("scanpyplot" or whatever) that

Motivation:

ivirshup commented 3 years ago

I think there's definitely room for more plotting libraries in the ecosystem, but I have some doubts about whether all needs can be met by one library. I personally use seaborn/matplotlib, bokeh, datashader, and altair for different cases. I also think making a good plotting API is exceedingly difficult, especially if you target both high- and low-level use cases. I would note that the plotting code in scanpy feels like some of the most maintenance-intensive code in the library.

provides helper functions for handling colors, saving figures, etc.

We can do a bit more of this here. But of course, much of it would end up being matplotlib specific.

encourages a consistent plotting API (e.g. by defining abstract base classes)

I'd be interested in hearing specific thoughts on this. I've personally been thinking it would be nice to lean on seaborn plotting classes more heavily here, potentially contributing features upstream. Here's one example of a feature which could fit the AnnData data model nicely: https://github.com/mwaskom/seaborn/issues/2487

there is quite some duplicated code in the plotting section

We'd definitely like to reduce the amount of duplicated code, which is what drove the addition of sc.get. This seems to be working out internally, if slowly.

All the scanpy helper functions for plotting (e.g. savefig_or_show, _set_color_for_categorical_obs etc.) are private scanpy functions

I'd like to move towards stabilizing this. I'm not sure how much we'd want to provide plotting library specific code, vs. more generic helpers. Right now the most obvious addition is _set_color_for_categorical_obs, which I'd also like to make accessible through sc.get. Adding groupby support to anndata would help a lot here too (https://github.com/theislab/anndata/issues/556).
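As a rough illustration of what a generic, plotting-library-agnostic color helper could look like, here is a minimal sketch. It is not scanpy's `_set_color_for_categorical_obs`; the function name `colors_for_categories` and the three-color palette are assumptions made up for this example, and it works on plain Python sequences rather than AnnData objects.

```python
from itertools import cycle, islice
from typing import Dict, Sequence

# Hypothetical example palette (three hex colors); any sequence of colors works.
EXAMPLE_PALETTE = ("#1f77b4", "#ff7f0e", "#2ca02c")


def colors_for_categories(
    categories: Sequence[str],
    palette: Sequence[str] = EXAMPLE_PALETTE,
) -> Dict[str, str]:
    """Map each category to a color, reusing palette entries if there are
    more categories than colors."""
    colors = islice(cycle(palette), len(categories))
    return dict(zip(categories, colors))


# Four categories, three colors: the fourth category wraps around to the first color.
mapping = colors_for_categories(["B cell", "T cell", "NK cell", "Monocyte"])
```

A helper like this returns plain data (a category-to-color mapping), so it could be consumed by matplotlib, bokeh, or altair code alike, which is the "generic helper" end of the trade-off discussed above.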

savefig_or_show is something that I don't think we should export, and it may need a rework (#1508).

grst commented 3 years ago

Hi @ivirshup,

thanks for your response! I agree that this can quickly get out of hand. I'd thus suggest to

In brief, all that is required to implement a plotting API that behaves like scanpy's.


I'd be interested in hearing specific thoughts on this. I've personally been thinking it would be nice to lean on seaborn plotting classes more heavily here, potentially contributing features upstream. Here's one example mwaskom/seaborn#2487 of a feature which could fit the AnnData data model nicely.

I was mostly referring to @fidelram's idea of how to make plot styling more "modular" instead of having a vast number of arguments for a single plotting function (#956). If this idea were to be implemented for all scanpy plotting functions, I thought that maybe an abstract base class could provide the method signatures to ensure consistency within scanpy and ecosystem packages. Even with the current "keyword approach", it would be great if there was some way to ensure that common keywords are always named consistently.
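To make the base-class idea more concrete, here is a minimal sketch of how an abstract base class could fix the method signatures that subclasses have to follow. Everything in it is hypothetical: `BasePlot`, `DictPlot`, and the method names `style`, `make_figure`, and `show` are illustrative assumptions, not an existing scanpy API, and the toy subclass "renders" to a plain dict so the example has no plotting dependency.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class BasePlot(ABC):
    """Hypothetical base class that defines a shared plotting interface."""

    def __init__(self, data: Any, *, title: Optional[str] = None) -> None:
        self.data = data
        self.title = title
        self._style: Dict[str, Any] = {}

    def style(self, **kwargs: Any) -> "BasePlot":
        """Record styling options and return self so calls can be chained."""
        self._style.update(kwargs)
        return self

    @abstractmethod
    def make_figure(self) -> Any:
        """Build and return the backend-specific figure object."""

    def show(self) -> Any:
        """Build the figure; a real backend would also display it."""
        return self.make_figure()


class DictPlot(BasePlot):
    """Toy subclass that returns a dict instead of drawing anything."""

    def make_figure(self) -> Dict[str, Any]:
        return {"title": self.title, "n_points": len(self.data), **self._style}


# Styling is set through chained calls rather than one long argument list.
fig = DictPlot([1, 2, 3], title="demo").style(cmap="viridis", dot_size=5).show()
```

Because `make_figure` is abstract, a subclass that forgets to implement it cannot be instantiated, which is one way a shared library could enforce consistency across ecosystem packages.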

What would be an example of a plot object you would like to "move" to seaborn? Something like a multi-panel UMAP plot?


I'd like to move towards stabilizing this. I'm not sure how much we'd want to provide plotting library specific code, vs. more generic helpers. Right now the most obvious addition is _set_color_for_categorical_obs, which I'd also like to make accessible through sc.get. Adding groupby support to anndata would help a lot here too (theislab/anndata#556).

That sounds great!


Finally, in terms of "reusable building blocks" I was thinking of, for instance,

ivirshup commented 2 years ago

Ping @WeilerP @adamgayoso, since you've both raised this idea today