scverse / spatialdata-plot

Static plotting for spatialdata
BSD 3-Clause "New" or "Revised" License
27 stars 13 forks source link

Problem with indices vs positional indices in shapes #169

Open LucaMarconato opened 11 months ago

LucaMarconato commented 11 months ago

I have just noticed that the squidpy notebook can't make the last plot; it should be a quick fix. The latest version of the notebook (gonna be merge in main soon) is this one: https://github.com/scverse/spatialdata-notebooks/blob/fix/remove_napari/notebooks/examples/squidpy_integration.ipynb

The dataset used in the notebook is the xenium rep 1 data, it should be unchanged from the one we have downloaded and the S3 version should be up-to-date (I will re-enable the upload functions today/tomorrow, so for sure it will be up-to-date soon).

The problem seem to be due to the use of positional indices where ".loc" indices should have been used. Please see the screenshot: the cell_id and gdf.index columns match, but it seems that at some point either a positional index for the table, either the table.obs.index has been used.

image

Faulty line:

sdata.pl.render_shapes(element="cell_circles", color="leiden").pl.show()

Traceback

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[7], line 1
----> 1 sdata.pl.render_shapes(element="cell_circles", color="leiden").pl.show()

File ~/embl/projects/basel/spatialdata-plot/src/spatialdata_plot/pl/basic.py:673, in PlotAccessor.show(self, coordinate_systems, legend_fontsize, legend_fontweight, legend_loc, legend_fontoutline, na_in_legend, colorbar, wspace, hspace, ncols, frameon, figsize, dpi, fig, title, share_extent, pad_extent, ax, return_ax, save)
    662     _render_images(
    663         sdata=sdata,
    664         render_params=params,
   (...)
    670         # extent=extent[cs],
    671     )
    672 elif cmd == "render_shapes" and cs_contents.query(f"cs == '{cs}'")["has_shapes"][0]:
--> 673     _render_shapes(
    674         sdata=sdata,
    675         render_params=params,
    676         coordinate_system=cs,
    677         ax=ax,
    678         fig_params=fig_params,
    679         scalebar_params=scalebar_params,
    680         legend_params=legend_params,
    681     )
    683 elif cmd == "render_points" and cs_contents.query(f"cs == '{cs}'")["has_points"][0]:
    684     _render_points(
    685         sdata=sdata,
    686         render_params=params,
   (...)
    691         legend_params=legend_params,
    692     )

File ~/embl/projects/basel/spatialdata-plot/src/spatialdata_plot/pl/render.py:121, in _render_shapes(sdata, render_params, coordinate_system, ax, fig_params, scalebar_params, legend_params)
    118     color_vector = color_vector[mask]
    119 shapes = gpd.GeoDataFrame(shapes, geometry="geometry")
--> 121 _cax = _get_collection_shape(
    122     shapes=shapes,
    123     s=render_params.scale,
    124     c=color_vector,
    125     render_params=render_params,
    126     rasterized=sc_settings._vector_friendly,
    127     cmap=render_params.cmap_params.cmap,
    128     norm=norm,
    129     fill_alpha=render_params.fill_alpha,
    130     outline_alpha=render_params.outline_alpha
    131     # **kwargs,
    132 )
    134 # Sets the limits of the colorbar to the values instead of [0, 1]
    135 if not norm and not values_are_categorical:

File ~/embl/projects/basel/spatialdata-plot/src/spatialdata_plot/pl/utils.py:248, in _get_collection_shape(shapes, c, s, norm, render_params, fill_alpha, outline_alpha, **kwargs)
    246     scaled_coords = [(centroid + (np.array(coord) - centroid) * s).tolist() for coord in geom.exterior.coords]
    247     row["geometry"] = mplp.Polygon(scaled_coords, closed=True)
--> 248     assign_fill_and_outline_to_row(shapes, fill_c, outline_c, row, idx)
    249     rows.append(row)
    251 elif geom.geom_type == "MultiPolygon":
    252     # mp = _make_patch_from_multipolygon(geom)

File ~/embl/projects/basel/spatialdata-plot/src/spatialdata_plot/pl/utils.py:235, in _get_collection_shape.<locals>.assign_fill_and_outline_to_row(shapes, fill_c, outline_c, row, idx)
    233     row["outline_c"] = outline_c
    234 else:
--> 235     row["fill_c"] = fill_c[idx]
    236     row["outline_c"] = outline_c[idx]

IndexError: index 167780 is out of bounds for axis 0 with size 167780
grst commented 5 months ago

I'm running into the same issue

LucaMarconato commented 1 month ago

Hi @timtreis I just noticed that this bug is still open. Do you have any timeline for a fix? Thanks 😊

LucaMarconato commented 3 weeks ago

Here is how to reproduce the problem. When the table has less rows than the shapes, the problem appears.

import matplotlib.pyplot as plt
from spatialdata.datasets import blobs_annotating_element
import spatialdata_plot

sdata = blobs_annotating_element("blobs_circles")
sdata['table'] = sdata['table'][:3]

sdata.pl.render_shapes("blobs_circles", color="instance_id").pl.show()
plt.show()

The behavior in napari, that would be expected from the user also here, is that the plot should be possible independently of which rows are present in the table and the shapes (it could be that the table has some rows in common, some missing and some extra; and no guarantee is expected from the order of the rows).

In napari shapes that are not present in the table are plotted in gray, while rows that are present in the table but not in the shapes are simply ignored. The order is matched via the join_spatialelement_table() API, which made the implementation easy.

LucaMarconato commented 3 weeks ago

The bug is related to this https://github.com/scverse/spatialdata-plot/issues/178 which if I remember well Sonja had fixed (and tests were added), I am looking for her PR. I think that plot can be made with any order so maybe the only uncovered cases are when there is a different number of rows between shapes and table. Using join_spatialelement_table() would fi all of this.

LucaMarconato commented 3 weeks ago

Here are the PRs from Sonja https://github.com/scverse/spatialdata-plot/pull/163/files and https://github.com/scverse/spatialdata-plot/pull/177/files.

So it seems that:

Practically, to address the issue I suggest to proceed as follows: