scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Dotplot / Matrixplot Bug/Suggestion [KeyError] because "var_group_labels" and "categories_order" share the same list object (memory), mostly when "swap_axes=True" #3081

Open WhatMelonGua opened 1 month ago

WhatMelonGua commented 1 month ago

What happened?

This is not really a bug report but a suggestion for a safer design when plotting with scanpy.

The problem can occur when a cell type name is too long to fit in the figure. It originates here:

https://github.com/scverse/scanpy/blob/a20334f02e6f2a0b56dd6dd862b07d5bdd4d879e/scanpy/plotting/_baseplot_class.py#L1059-L1061

This code truncates the label strings passed to dotplot/matrixplot(var_group_labels=), which are forwarded to _plot_var_groups_brackets(group_labels=).

So with code like the sample below, the very long labels are truncated and we get an error. Because celltype_order is a list, the function _plot_var_groups_brackets (path: scanpy/plotting/_baseplot_class.py) replaces entries of that list in place, which changes celltype_order itself. When the same list is then used as categories_order, it has already been changed by _plot_var_groups_brackets, and the error occurs.
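A minimal sketch of the aliasing effect described above. It does not use scanpy: `truncate_labels_in_place` and its `max_len` parameter are hypothetical stand-ins for a helper that shortens labels by overwriting list entries, and the cutoff of 8 characters is chosen only so that the result matches the error message.

```python
def truncate_labels_in_place(labels, max_len=8):
    # Hypothetical helper, not scanpy code: shortens long labels by
    # overwriting entries of the list it was given.
    for idx, label in enumerate(labels):
        if len(label) > max_len:
            labels[idx] = label[:max_len] + "."


celltype_order = ["short", "veryveryverylong_name", "others"]

# Both names refer to the same list object, as in the sample call.
categories_order = celltype_order
var_group_labels = celltype_order

truncate_labels_in_place(var_group_labels)

# The list seen through categories_order has changed as well.
print(categories_order)  # ['short', 'veryvery.', 'others']
```

Any later lookup of the entries of `categories_order` in an index that still holds the full names would then fail for `'veryvery.'`.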

I suggest copying the group_labels parameter inside the function _plot_var_groups_brackets. This could help users who are less experienced with coding avoid the problem.

For example, add the line group_labels = copy.deepcopy(group_labels) at the top of the function _plot_var_groups_brackets.
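A sketch of what such a defensive copy could look like, again without scanpy. `truncate_labels_copy` and `max_len` are hypothetical names used only for illustration; the real function has a different signature and body.

```python
import copy


def truncate_labels_copy(labels, max_len=8):
    # Hypothetical helper, not scanpy code: work on a private copy so the
    # caller's list is never modified.
    labels = copy.deepcopy(labels)
    for idx, label in enumerate(labels):
        if len(label) > max_len:
            labels[idx] = label[:max_len] + "."
    return labels


celltype_order = ["short", "veryveryverylong_name", "others"]
shortened = truncate_labels_copy(celltype_order)

print(shortened)       # ['short', 'veryvery.', 'others']
print(celltype_order)  # ['short', 'veryveryverylong_name', 'others']
```

Since Python strings are immutable, a shallow copy such as `list(labels)` would probably be enough here; `copy.deepcopy` is shown because it is the form proposed above. If the diagnosis in this report is right, a caller could likely get the same protection today by passing `list(celltype_order)` for one of the two arguments.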

Thank you very much for your attention

Minimal code sample

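import scanpy as sc

# adata: any AnnData object
# markers: list of genes present in adata.var_names
# group: key in adata.obs
# pos_markers: positions for var_group_positions (not defined in this report)
celltype_order = ['short', 'veryveryverylong_name', 'others', ...]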

sc.pl.dotplot(
    adata, markers, group, show=False, swap_axes=True,
    categories_order=celltype_order, var_group_labels=celltype_order, var_group_positions=pos_markers,
)

Error output

KeyError: "['veryvery.'] not in index"
# (in fact the 'veryvery.' comes from the 'veryveryverylong_name' in celltype_order )

flying-sheep commented 1 month ago

I can’t reproduce this, please provide a fully reproducible example that I can just paste and execute.