scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

scanpy.pl.stacked_violin - remove "marker label" whitespace if var_names is a list of genes #3320

Open adkinsrs opened 2 days ago

adkinsrs commented 2 days ago

What happened?

I am creating a stacked violin plot and passing a list of marker genes as the "var_names" argument. Whenever I create the plot, it has extra whitespace at the top, where the marker gene labels should go. This is most evident when you add a figure title, which gets placed above the padding whitespace. I have not found a way around this, and am currently looking through the scanpy internal code to see if there is some padding setting that I can undo. If there is a workaround or an option that I am missing, I would like to know.

Thanks!

Minimal code sample

import scanpy as sc

# I have an AnnData object ("adata") that has undergone the Seurat analysis. I named the Leiden clustering output "spatial_clusters" since I was testing a spatial dataset read in via spatialdata, then converted to AnnData with "spatialdata_io.experimental.to_legacy_anndata"

marker_genes = ["Pou4f3", "Calb2", "Pvalb", "Smpx", "Mlf1", "Sox2"]  # 5 random Cochlear HCs P7 + Sox2

sc.pl.stacked_violin(adata, marker_genes, title="Marker gene expression per cluster", groupby="spatial_clusters", cmap="YlOrRd", show=False, return_fig=True)

### COMPARISON TO MARKER GENES WITH LABELS
marker_genes = {"IHC": ["Pou4f3", "Calb2", "Pvalb", "Smpx", "Mlf1"], "Random": ["Sox2"]}

sc.pl.stacked_violin(adata, marker_genes, title="Marker gene expression per cluster", groupby="spatial_clusters", cmap="YlOrRd", show=False, return_fig=True)

Error output

Will post in the next comment on this thread. Seems I cannot drag-n-drop images into this block.

Versions

I am including relevant package versions. Can provide more if needed.

Python 3.12.7

```
anndata==0.10.6
matplotlib==3.9.0
pandas==2.2.1
scanpy-1.10.3
```

adkinsrs commented 2 days ago

Error Output

Output when using a list of marker genes in the var_names arg: [Image]

Output when using a dict of marker genes in the var_names arg: [Image]

flying-sheep commented 2 days ago

Yeah, that’s true.

I think depending on what’s happening, it’s actually the spacer we add, not the area for the dendrogram, but sometimes we also need that spacer …
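To make the effect of such a spacer concrete, here is a minimal matplotlib-only sketch. It is not scanpy's actual layout code: the two-row `GridSpec`, the `height_ratios` values, and the helper name `main_axes_top` are assumptions chosen purely for illustration. It compares where the top edge of the main axes lands with and without an empty spacer row reserved above it.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec


def main_axes_top(with_spacer: bool) -> float:
    """Return the top edge (figure coordinates, 0..1) of the main plot axes."""
    fig = plt.figure(figsize=(4, 3))
    fig.suptitle("Marker gene expression per cluster")
    if with_spacer:
        # Hypothetical layout: a thin empty row above the main plot,
        # similar in spirit to an area reserved for group labels.
        gs = GridSpec(nrows=2, ncols=1, figure=fig, height_ratios=[0.5, 3], hspace=0.05)
        spacer_ax = fig.add_subplot(gs[0, 0])
        spacer_ax.axis("off")  # invisible, but it still occupies space
        main_ax = fig.add_subplot(gs[1, 0])
    else:
        gs = GridSpec(nrows=1, ncols=1, figure=fig)
        main_ax = fig.add_subplot(gs[0, 0])
    top = main_ax.get_position().y1
    plt.close(fig)
    return top


top_with_spacer = main_axes_top(with_spacer=True)
top_without_spacer = main_axes_top(with_spacer=False)
print(f"with spacer: {top_with_spacer:.3f}, without spacer: {top_without_spacer:.3f}")
```

With the spacer row, the main axes starts lower in the figure, so a title drawn with `suptitle` ends up separated from the plot by empty space, which matches the symptom reported in this issue.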

Our plotting code is complicated and needs to be overhauled. If anyone feels like diving int

adkinsrs commented 1 day ago

I was able to "hack" my way to a solution. However, I needed to call StackedViolin.make_figure to actually generate the figure whose Axes I could then remove, and this causes the plot to be shown twice (once incorrect, once correct) in a Jupyter notebook. In a script this shouldn't be an issue.


marker_genes = ["Pou4f3", "Calb2", "Pvalb", "Smpx", "Mlf1", "Sox2"]  # 5 random Cochlear HCs P7 + Sox2

violin_fig = sc.pl.stacked_violin(vis_adata, marker_genes, var_group_positions=None, title="Marker gene expression per cluster", groupby="spatial_clusters", cmap="YlOrRd", show=False, return_fig=True)

# Hide the default legend; add_totals() adds a bar plot of cell counts per group
violin_fig.legend(show=False)
violin_fig.add_totals()
violin_fig.make_figure()

# For some reason, deleting all axes and remaking the figure draws it without the spacer above the plot (which was in ax[2], I think)
violin_axes = violin_fig.fig.get_axes()

for ax in violin_axes:
    violin_fig.fig.delaxes(ax)
violin_fig.make_figure()

Image
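The duplicate output in a notebook is most likely because the inline backend displays every figure that is still open when a cell finishes. One way to avoid that is to close the figure you no longer want before the cell ends. Below is a minimal sketch using plain matplotlib stand-ins rather than scanpy objects; `first_fig` and `second_fig` are hypothetical placeholders for the figures produced by the two `make_figure()` calls.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

plt.close("all")  # start from a clean slate

# Stand-in for the first (unwanted) figure
first_fig = plt.figure()
first_fig.add_subplot(1, 1, 1)

# Stand-in for the second (wanted) figure
second_fig = plt.figure()
second_fig.add_subplot(1, 1, 1)

# Closing the first figure removes it from pyplot's registry of open figures,
# so only the second one is left to be displayed.
plt.close(first_fig)

remaining = plt.get_fignums()
print(remaining == [second_fig.number])
```

Applied to the workaround above, this would mean calling `plt.close(violin_fig.fig)` right before the second `violin_fig.make_figure()`, on the assumption (not verified here) that the second call builds a new figure rather than reusing the first.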

flying-sheep commented 1 day ago

Very cool! Happy you found a workaround. I’d really like to toss a lot of the plotting out and replace it with something declarative …