scverse / spatialdata

An open and interoperable data framework for spatial omics data
https://spatialdata.scverse.org/
BSD 3-Clause "New" or "Revised" License
174 stars 34 forks source link

Error with `polygon_qurey()` function #532

Closed GuoshengMa closed 1 month ago

GuoshengMa commented 1 month ago

Hi @LucaMarconato ! A few days ago, following the instructions in the Annotating regions of interest with napari section, I successfully annotated regions of interest with napari. However, today I encountered an error when running the polygon_query() function following the same code.

Here is my code:

shape0 = (
    np.array(
        [
            [
                182.0,181.0,179.0,178.0,174.0,173.0,172.0,168.0,165.0,164.0,164.0,166.0,167.0,168.0,169.0,169.0,168.0,166.0,163.0,163.0,
                163.0,164.0,165.0,167.0,168.0,169.0,172.0,173.0,174.0,175.0,180.0,185.0,187.0,188.0,189.0,190.0,191.0,192.0,193.0,194.0,
                196.0,198.0,200.0,202.0,205.0,207.0,209.0,211.0,212.0,213.0,215.0,215.0,216.0,216.0,215.0,215.0,214.0,212.0,211.0,209.0,
                208.0,208.0,206.0,205.0,204.0,202.0,200.0,196.0,195.0,195.0,195.0,194.0,192.0,190.0,189.0,186.0,184.0,184.0,183.0,184.0,
                185.0,186.0,186.0,184.0,182.0,182.0,182.0,183.0,183.0,183.0,182.0,182.0,183.0,183.0,181.0,178.0,178.0,178.0,179.0,180.0,
                182.0,182.0,183.0,183.0,182.0,
            ],
            [
                105.0,105.0,106.0,107.0,108.0,109.0,110.0,112.0,114.0,115.0,118.0,121.0,122.0,123.0,125.0,126.0,129.0,136.0,140.0,141.0,
                143.0,151.0,153.0,155.0,157.0,158.0,160.0,162.0,162.0,163.0,168.0,173.0,174.0,174.0,175.0,176.0,178.0,179.0,179.0,181.0,
                181.0,182.0,183.0,183.0,183.0,183.0,182.0,181.0,179.0,177.0,174.0,173.0,170.0,167.0,164.0,160.0,159.0,158.0,159.0,161.0,
                161.0,161.0,158.0,156.0,156.0,156.0,157.0,160.0,161.0,162.0,164.0,164.0,164.0,162.0,161.0,160.0,159.0,157.0,156.0,153.0,
                151.0,148.0,147.0,145.0,144.0,143.0,141.0,139.0,137.0,136.0,135.0,134.0,132.0,131.0,129.0,126.0,123.0,120.0,118.0,115.0,
                112.0,111.0,110.0,109.0,105.0,
            ],
        ]
    )
    * 100
)
# fmt: on

gdf = GeoDataFrame({"geometry": [numpy_to_shapely(shape0)]})
gdf = ShapesModel.parse(gdf, transformations={"aligned": Identity()})
visium_sdata.shapes["lasso"] = gdf

from spatialdata import polygon_query

polygon = visium_sdata['lasso'].geometry.iloc[0]
polygon_query(visium_sdata, polygon, "aligned")["table"]

And here is the error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[32], [line 4](vscode-notebook-cell:?execution_count=32&line=4)
      [1](vscode-notebook-cell:?execution_count=32&line=1) from spatialdata import polygon_query
      [3](vscode-notebook-cell:?execution_count=32&line=3) polygon = visium_sdata['lasso'].geometry.iloc[0]
----> [4](vscode-notebook-cell:?execution_count=32&line=4) polygon_query(visium_sdata, polygon, "aligned")["table"]

File [c:\Users\mgssu\anaconda3\envs\spatialData\Lib\site-packages\spatialdata\_core\spatialdata.py:1662](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/spatialdata.py:1662), in SpatialData.__getitem__(self, item)
   [1649](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/spatialdata.py:1649) def __getitem__(self, item: str) -> SpatialElement:
   [1650](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/spatialdata.py:1650)     """
   [1651](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/spatialdata.py:1651)     Return the element with the given name.
   [1652](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/spatialdata.py:1652) 
   (...)
   [1660](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/spatialdata.py:1660)     The element.
   [1661](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/spatialdata.py:1661)     """
-> [1662](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/spatialdata.py:1662)     _, _, element = self._find_element(item)
   [1663](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/spatialdata.py:1663)     return element

File [c:\Users\mgssu\anaconda3\envs\spatialData\Lib\site-packages\spatialdata\_core\spatialdata.py:1574](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/spatialdata.py:1574), in SpatialData._find_element(self, element_name)
   [1572](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/spatialdata.py:1572)         return element_type, element_name_, element
   [1573](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/spatialdata.py:1573) else:
-> [1574](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/spatialdata.py:1574)     raise KeyError(f"Could not find element with name {element_name!r}")

KeyError: "Could not find element with name 'table'"

Then, I removed table and found it works:

from spatialdata import polygon_query

polygon = visium_sdata['lasso'].geometry.iloc[0]
polygon_query(visium_sdata, polygon, "aligned")
SpatialData object with:
└── Shapes
      ├── 'lasso': GeoDataFrame shape: (1, 1) (2D shapes)
      └── 'my_shapes_2': GeoDataFrame shape: (1, 1) (2D shapes)
with coordinate systems:
▸ 'aligned', with elements:
        lasso (Shapes), my_shapes_2 (Shapes)

Question1: However, this SpatialData object did not have table, which prevents us from obtaining the visium information within that region. How can this be resolved?

Question2: I noticed in the Annotating regions of interest with napari section, Limitations part, it mentioned that We currently don’t support saving circles, ellipses or segments as well as overwriting existing elements. The latter will be possible very soon though. I was wondering if they are supported now? Below is the error I encountered when trying to use circles:

codex_sdata
SpatialData object with:
├── Images
│     └── 'image': SpatialImage[cyx] (54, 2753, 2658)
├── Labels
│     └── 'labels': SpatialImage[yx] (1763, 1583)
├── Shapes
│     └── '3D2_spotID': GeoDataFrame shape: (8821, 2) (2D shapes)
└── Tables
      └── 'table': AnnData (5246, 53)
with coordinate systems:
▸ '3D2', with elements:
        image (Images), labels (Labels), 3D2_spotID (Shapes)
▸ 'downscaled_hires', with elements:
        image (Images), labels (Labels), 3D2_spotID (Shapes)
▸ 'downscaled_lowres', with elements:
        3D2_spotID (Shapes)
▸ 'global', with elements:
        image (Images), labels (Labels), 3D2_spotID (Shapes)
codex_sdata["3D2_spotID"]

image

from spatialdata import polygon_query

# filtered_tables = {}

polygon = codex_sdata["3D2_spotID"].iloc[4096]
polygon_query(codex_sdata, polygon, "3D2")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[9], [line 6](vscode-notebook-cell:?execution_count=9&line=6)
      [3](vscode-notebook-cell:?execution_count=9&line=3) # filtered_tables = {}
      [5](vscode-notebook-cell:?execution_count=9&line=5) polygon = codex_sdata["3D2_spotID"].iloc[4096]
----> [6](vscode-notebook-cell:?execution_count=9&line=6) polygon_query(codex_sdata, polygon, "3D2")

File [c:\Users\mgssu\anaconda3\envs\spatialData\Lib\functools.py:909](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/functools.py:909), in singledispatch.<locals>.wrapper(*args, **kw)
    [905](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/functools.py:905) if not args:
    [906](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/functools.py:906)     raise TypeError(f'{funcname} requires at least '
    [907](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/functools.py:907)                     '1 positional argument')
--> [909](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/functools.py:909) return dispatch(args[0].__class__)(*args, **kw)

File [c:\Users\mgssu\anaconda3\envs\spatialData\Lib\site-packages\spatialdata\_core\query\spatial_query.py:828](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/query/spatial_query.py:828), in _(sdata, polygon, target_coordinate_system, filter_table, shapes, points, images, labels)
    [826](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/query/spatial_query.py:826) for element_type in ["points", "images", "labels", "shapes"]:
    [827](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/query/spatial_query.py:827)     elements = getattr(sdata, element_type)
--> [828](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/query/spatial_query.py:828)     queried_elements = _dict_query_dispatcher(
    [829](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/query/spatial_query.py:829)         elements,
    [830](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/query/spatial_query.py:830)         polygon_query,
    [831](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/query/spatial_query.py:831)         polygon=polygon,
    [832](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/query/spatial_query.py:832)         target_coordinate_system=target_coordinate_system,
    [833](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/query/spatial_query.py:833)     )
    [834](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/query/spatial_query.py:834)     new_elements[element_type] = queried_elements
    [836](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/spatialdata/_core/query/spatial_query.py:836) tables = _get_filtered_or_unfiltered_tables(filter_table, new_elements, sdata)
...
    [149](file:///C:/Users/mgssu/anaconda3/envs/spatialData/Lib/site-packages/geopandas/_vectorized.py:149)     return np.array(out, dtype=object)

TypeError: Input must be valid geometry objects: geometry    POINT (4301 8687)
radius              40.540351
Name: 4096, dtype: object
melonora commented 1 month ago

Hi @GuoshengMa!

Regarding question 1, could you try setting the filter_tables parameter to False as in the API docs: https://spatialdata.scverse.org/en/latest/generated/spatialdata.polygon_query.html?

For the visium_sdata you can check whether the table in here is in fact annotating the particular shapes element. You can do this with the SpatialData.get_annotated_regions(<AnnData>) method: https://spatialdata.scverse.org/en/latest/generated/spatialdata.SpatialData.html#spatialdata.SpatialData.get_annotated_regions.

Regarding question 2, this is still not supported yet. The release was only last week, but we are working on it!

Please let me know if I can be of further assistance.

GuoshengMa commented 1 month ago

@melonora Great! Question 1 has been successfully resolved! However, I noticed that in the tutorial, this parameter has not been added yet. It would be advisable for you to update the tutorial to prevent others from encountering the same issue :D

In addition, I have another question. I am currently attempting to perform a joint analysis of 10× visium (ST) and CODEX data with the aim of obtaining the cellular composition of each spot in ST corresponding to CODEX. My approach is as follows:

  1. Align the ST and CODEX images using the SpatialData package.
  2. Utilize the spot_id information from ST and the polygon_query() function to obtain the corresponding CODEX data for each spot.

After the resolution of question 2, since the CODEX data provides the coordinates of each cell, I wonder if the afore-mentioned analysis can achieve my objective?

melonora commented 1 month ago

I believe this is a valid approach yes! @LucaMarconato any other ideas?

With respect to the notebook, I added a commit in this PR: https://github.com/scverse/spatialdata-notebooks/pull/91.

With that is your issue resolved @GuoshengMa ?

GuoshengMa commented 1 month ago

@melonora By setting the filter_tables parameter to False, I can successfully obtain tables. However, the tables I obtain do not seem to be filtered and are not specific to the polygon query region.

Here is my code:

# we rounded the coordinates to make it less verbose here in the notebook
# fmt: off
shape0 = (
    np.array(
        [
            [
                182.0,181.0,179.0,178.0,174.0,173.0,172.0,168.0,165.0,164.0,164.0,166.0,167.0,168.0,169.0,169.0,168.0,166.0,163.0,163.0,
                163.0,164.0,165.0,167.0,168.0,169.0,172.0,173.0,174.0,175.0,180.0,185.0,187.0,188.0,189.0,190.0,191.0,192.0,193.0,194.0,
                196.0,198.0,200.0,202.0,205.0,207.0,209.0,211.0,212.0,213.0,215.0,215.0,216.0,216.0,215.0,215.0,214.0,212.0,211.0,209.0,
                208.0,208.0,206.0,205.0,204.0,202.0,200.0,196.0,195.0,195.0,195.0,194.0,192.0,190.0,189.0,186.0,184.0,184.0,183.0,184.0,
                185.0,186.0,186.0,184.0,182.0,182.0,182.0,183.0,183.0,183.0,182.0,182.0,183.0,183.0,181.0,178.0,178.0,178.0,179.0,180.0,
                182.0,182.0,183.0,183.0,182.0,
            ],
            [
                105.0,105.0,106.0,107.0,108.0,109.0,110.0,112.0,114.0,115.0,118.0,121.0,122.0,123.0,125.0,126.0,129.0,136.0,140.0,141.0,
                143.0,151.0,153.0,155.0,157.0,158.0,160.0,162.0,162.0,163.0,168.0,173.0,174.0,174.0,175.0,176.0,178.0,179.0,179.0,181.0,
                181.0,182.0,183.0,183.0,183.0,183.0,182.0,181.0,179.0,177.0,174.0,173.0,170.0,167.0,164.0,160.0,159.0,158.0,159.0,161.0,
                161.0,161.0,158.0,156.0,156.0,156.0,157.0,160.0,161.0,162.0,164.0,164.0,164.0,162.0,161.0,160.0,159.0,157.0,156.0,153.0,
                151.0,148.0,147.0,145.0,144.0,143.0,141.0,139.0,137.0,136.0,135.0,134.0,132.0,131.0,129.0,126.0,123.0,120.0,118.0,115.0,
                112.0,111.0,110.0,109.0,105.0,
            ],
        ]
    )
    * 100
)
# fmt: on

gdf = GeoDataFrame({"geometry": [numpy_to_shapely(shape0)]})
gdf = ShapesModel.parse(gdf, transformations={"aligned": Identity()})
visium_sdata.shapes["lasso"] = gdf

image

image

image

The results from the tutorial are as follows:

{'lasso': AnnData object with n_obs × n_vars = 102 × 18085
     obs: 'in_tissue', 'array_row', 'array_col', 'spot_id', 'region', 'dataset', 'clone'
     var: 'gene_ids', 'feature_types', 'genome'
     uns: 'spatial', 'spatialdata_attrs'
     obsm: 'spatial',
 'my_shapes': AnnData object with n_obs × n_vars = 139 × 18085
     obs: 'in_tissue', 'array_row', 'array_col', 'spot_id', 'region', 'dataset', 'clone'
     var: 'gene_ids', 'feature_types', 'genome'
     uns: 'spatial', 'spatialdata_attrs'
     obsm: 'spatial',
 'my_shapes_2': AnnData object with n_obs × n_vars = 142 × 18085
     obs: 'in_tissue', 'array_row', 'array_col', 'spot_id', 'region', 'dataset', 'clone'
     var: 'gene_ids', 'feature_types', 'genome'
     uns: 'spatial', 'spatialdata_attrs'
     obsm: 'spatial'}

As you can see, in my case, it seems that polygon_query() did not return the corresponding AnnData object for the polygon. What could be the issue?

melonora commented 1 month ago

I will look into the issue a bit later. From our conversation though it is clear that the docstring for this function is not informative enough:) I will change the docstring as well, it is already correctly explained in the notebook that is part of the PR now though.

The main thing that filtered_table does is filter the rows that are present in the tables annotating elements in the resulting SpatialData object. It is filtered based on which rows rows of the element which the table is annotating are actually present in the volume of the polygon that you query. If this results in an empty table then this table is omitted. In case the parameter is set to False then all filtered tables are returned but without any filtering of the rows

GuoshengMa commented 1 month ago

Noted. I trust that you will be able to resolve this issue quickly. Looking forward to your response!