scverse / spatialdata

An open and interoperable data framework for spatial omics data
https://spatialdata.scverse.org/
BSD 3-Clause "New" or "Revised" License

Different cell ids selected in regions: spatialdata vs Xenium explorer #507

Open EST09 opened 1 month ago

EST09 commented 1 month ago

Hi,

Thank you for your earlier help with the polygons. We're investigating some polygons of interest, but the same region selects different cells in Xenium Explorer than in spatialdata. I assume the differing cells lie on the edge of the region (110 cells differ in a selection of ~12,800). Is there a way to plot the differing cells, by cell id, to check?
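One way to pin down exactly which cells differ is a set comparison of the two id lists. This is a minimal sketch using made-up cell ids; the names `explorer_ids` and `spatialdata_ids` are hypothetical stand-ins for the ids exported from Xenium Explorer and the ids in the table of the spatialdata query result.

```python
# Hypothetical cell ids; in practice these would come from the Xenium Explorer
# selection export and from the "cell_id" column of the queried table.
explorer_ids = ["aaaa-1", "aaab-1", "aaac-1", "aaad-1"]
spatialdata_ids = ["aaab-1", "aaac-1", "aaad-1", "aaae-1"]

# Cells selected by one tool but not by the other.
only_in_explorer = sorted(set(explorer_ids) - set(spatialdata_ids))
only_in_spatialdata = sorted(set(spatialdata_ids) - set(explorer_ids))

# Combined list of differing cell ids, e.g. for annotating and plotting.
diff = only_in_explorer + only_in_spatialdata
print(len(diff), diff)
```

Keeping the two directions separate shows whether one tool is consistently more inclusive at the region boundary than the other.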

Thank you!

Best wishes, Emily

LucaMarconato commented 1 month ago

Hi, I have extended the "napari rois" documentation notebook with an example that plots the selected regions: https://spatialdata.scverse.org/en/latest/tutorials/notebooks/notebooks/examples/napari_rois.html. The code should be easy to adapt to your use case.

Also, another user asked about adding the possibility to control which cells are considered during the query by specifying the degree of overlap. This is not implemented yet, but we may add it. For the moment, please see the discussion at https://github.com/scverse/spatialdata/discussions/472, which also covers some approaches to this.
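To illustrate why the choice of overlap rule matters for cells on the edge, here is a small sketch with shapely. The geometries are made up, and the three rules shown (any intersection, centroid inside, at least 50% area overlap) are assumptions chosen for illustration; they are not claimed to be what spatialdata or Xenium Explorer actually uses.

```python
from shapely.geometry import box

# Hypothetical region of interest: a 10 x 10 square.
roi = box(0, 0, 10, 10)

# Hypothetical cell boundaries keyed by cell id.
cells = {
    "inside": box(2, 2, 4, 4),            # fully inside the ROI
    "edge_mostly_in": box(8, 4, 11, 6),   # 2/3 of its area inside
    "edge_mostly_out": box(9, 7, 13, 9),  # 1/4 of its area inside
    "outside": box(12, 0, 14, 2),         # no overlap at all
}

def overlap_fraction(cell, region):
    """Fraction of the cell's area that falls inside the region."""
    return cell.intersection(region).area / cell.area

# Rule 1: keep any cell that touches or overlaps the region.
by_intersection = {cid for cid, geom in cells.items() if geom.intersects(roi)}
# Rule 2: keep cells whose centroid lies inside the region.
by_centroid = {cid for cid, geom in cells.items() if roi.contains(geom.centroid)}
# Rule 3: keep cells with at least half of their area inside the region.
by_half_overlap = {
    cid for cid, geom in cells.items() if overlap_fraction(geom, roi) >= 0.5
}

# Cells whose selection depends on the rule are exactly the edge cells.
print(sorted(by_intersection - by_centroid))
```

Only the cell straddling the boundary changes membership between rules, which matches the pattern of a small number of differing cells along the edge of a region.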

EST09 commented 1 month ago

Thank you, I've given it a try but I keep getting the error "only integer scalar arrays can be converted to a scalar index".

I had to use .table because the table wasn't found with ["table"], so this may be the issue?

import matplotlib.pyplot as plt
import pandas as pd
import spatialdata_plot  # registers the .pl accessor used below

categories = ["same", "different"]
n = len(cropped_sdata.table)

# Start with every cell annotated as "same"
cropped_sdata.table.obs["annotation"] = pd.Categorical(["same"] * n, categories=categories)

# Get the obs index label of each cell id in diff
diff_indexes = [cropped_sdata.table.obs.loc[cropped_sdata.table.obs["cell_id"] == cell].index[0] for cell in diff]

# Mark the differing cells
cropped_sdata.table.obs.loc[diff_indexes, "annotation"] = "different"

plt.figure(figsize=(12, 7))
ax = plt.gca()

cropped_sdata.pl.render_shapes("cell_boundaries", color="annotation").pl.show(ax=ax)
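The per-cell index lookup above can also be written as a single vectorized step with `isin`. The sketch below runs on a plain pandas DataFrame standing in for `cropped_sdata.table.obs`, with made-up cell ids. It is not claimed to resolve the error reported here, but it does avoid one failure mode of the `.index[0]` lookup, which raises an IndexError for any id in `diff` that is absent from the cropped table.

```python
import pandas as pd

# Stand-in for cropped_sdata.table.obs; the cell ids are made up.
obs = pd.DataFrame({"cell_id": ["a", "b", "c", "d"]}, index=["0", "1", "2", "3"])
diff = ["b", "d", "z"]  # "z" is deliberately not in the cropped table

# Boolean mask of the cells whose id appears in diff.
is_diff = obs["cell_id"].isin(diff)

# Build the categorical annotation column in one step.
obs["annotation"] = pd.Categorical(
    is_diff.map({True: "different", False: "same"}),
    categories=["same", "different"],
)

# Ids in diff that do not exist in this table (e.g. cropped away by the query).
missing = sorted(set(diff) - set(obs["cell_id"]))
print(obs)
print("not in table:", missing)
```

Listing the missing ids separately makes it easy to see whether some of the differing cells were excluded from the cropped object altogether.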

Thank you

Best wishes, Emily

EST09 commented 1 month ago

Plotting works if I annotate the cells in the whole dataset (where diff is a list of cell ids):

for i in diff:
    sdata.table.obs.loc[sdata.table.obs["cell_id"] == i, "new_region"] = "0"

sdata.table.obs.loc[sdata.table.obs["new_region"] != "0", "new_region"] = "1"

# Unique new regions
sdata.table.obs["new_region"].unique()

# Plot sdata coloured by new region

f, ax = plt.subplots(figsize=(20, 20))

sdata.pl.render_shapes(color="new_region").pl.show(ax=ax)

but not if I do the same with cropped_sdata. The differing cells do seem to be on the edge of the region, so that's hopefully the explanation. Thank you for all your help and a great package!

Best wishes, Emily

LucaMarconato commented 1 month ago

You are welcome 😊

Thanks for the clarification. To help investigate the problem, could you please add a reproducible example? It could use the same Visium data as the napari_rois notebook. Another option is to use a small example dataset, accessible with

from spatialdata.datasets import blobs

sdata = blobs()

For querying the data you could use the bounding box query, or hardcode the coordinates of the shapely.Polygon.

Thank you, it would make troubleshooting easier 😊