Open MiTPenguin opened 2 months ago
Hi, thanks for reaching out!
With squidpy, plotting an individual ROI from an AnnData containing multiple ROIs can be controlled by supplying the parameters library_id and library_key. Is there an equivalent concept in SpatialData? For example, if I'm rendering an Image (a ROI) and a set of cell Shapes, and want to color them by metadata from the table (AnnData), from a SpatialData object containing multiple ROIs, how would the function determine the corresponding data to pull from each module? Is it primarily through unique coordinate systems?
Yes, you can plot a specific ROI even if the table contains multiple ROIs. The `spatialdata-plot` and `napari-spatialdata` libraries take care of matching the table to the ROIs. This link is given by the `region`, `region_key` and `instance_key` metadata of the table, explained in the docs for [TableModel.parse()](https://spatialdata.scverse.org/en/latest/generated/spatialdata.models.TableModel.html) and shown in this example notebook.
For very large objects, manually subsetting before plotting may be more performant (but if performance is an issue, please report it and we can optimize the automatic subsetting). An example of subsetting the data for a dataset similar to yours is found here (3 ROIs, 1 table). Minor note: in the next release `pp.get_elements()` will be replaced by `.subset()`, just so you know.
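To make the matching concrete, here is a minimal pure-Python sketch of the idea, not the actual spatialdata implementation. It assumes a table whose rows carry a region column and an instance id column; the row values, the element names (`roi1_cells`, `roi2_cells`) and the helper `annotations_for` are all hypothetical and only illustrate how rows can be selected for one ROI and linked to its shapes.

```python
# Hypothetical table rows; in a real SpatialData table these would be
# columns of the AnnData .obs, named by region_key and instance_key.
rows = [
    {"region": "roi1_cells", "instance_id": 1, "cell_type": "T cell"},
    {"region": "roi1_cells", "instance_id": 2, "cell_type": "B cell"},
    {"region": "roi2_cells", "instance_id": 1, "cell_type": "Tumor"},
]


def annotations_for(rows, element_name, column,
                    region_key="region", instance_key="instance_id"):
    """Map instance id -> annotation value for the rows annotating one element."""
    return {
        row[instance_key]: row[column]
        for row in rows
        if row[region_key] == element_name
    }


# Only the rows that annotate "roi1_cells" are used to color that ROI.
roi1_colors = annotations_for(rows, "roi1_cells", "cell_type")
print(roi1_colors)  # {1: 'T cell', 2: 'B cell'}
```

Note that instance id 1 appears in both ROIs; the region column is what disambiguates them.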
I know there's an "instance_id" parameter (that's inserted when I use the legacy conversion function): but does the instance_id have to be unique across the entire dataset? How about for cell segmentation mask, where the ID is necessarily integer only?
`region_key` and `instance_key` are the names of two columns that must be present in a table that annotates one or more samples. Each pair of values (i.e. each row of these two columns) must be unique. Uniqueness of the `instance_key` values alone is not required, as it would be too restrictive.
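The uniqueness rule can be checked before building the table. The following is a small sketch in plain Python, assuming the two columns are available as equal-length lists; the helper name `duplicated_pairs` and the example values are hypothetical and not part of spatialdata.

```python
from collections import Counter


def duplicated_pairs(regions, instance_ids):
    """Return the (region, instance_id) pairs that occur more than once."""
    if len(regions) != len(instance_ids):
        raise ValueError("both columns must have the same length")
    counts = Counter(zip(regions, instance_ids))
    return sorted(pair for pair, n in counts.items() if n > 1)


# Integer mask labels restart at 1 in every ROI: the instance ids repeat,
# but each (region, instance_id) pair is still unique, which is sufficient.
regions = ["roi1_cells", "roi1_cells", "roi2_cells", "roi2_cells"]
instance_ids = [1, 2, 1, 2]
print(duplicated_pairs(regions, instance_ids))  # []
```

An empty result means the pairs are unique, so integer-only segmentation labels that restart in each ROI are fine as long as the region values differ.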
How should we set coordinate system in a dataset like this? Should it be a coordinate for each sample? for each ROI? And is there a faster way to set it up instead of just looping through set_transformation multiple times?
I suggest having one coordinate system per sample and, in addition, one coordinate system per ROI. Currently the only way to proceed is to loop; for instance, this is what we do in this notebook in the function `postpone_transformation()`. We have prepared a new design that will remove the need for loops, but it will take quite some time before we finish implementing it.
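As a rough sketch of the suggested layout, the loop can first be written as a plan of (element, coordinate system) pairs. This is plain Python with assumed names: the `sample_roi` element naming convention, the sample and ROI names, and the helper `plan_coordinate_systems` are all hypothetical. Each resulting pair would then be passed, together with a suitable transformation, to `set_transformation`.

```python
def plan_coordinate_systems(rois_by_sample):
    """List (element_name, coordinate_system) pairs: one per sample, one per ROI."""
    plan = []
    for sample, rois in rois_by_sample.items():
        for roi in rois:
            element = f"{sample}_{roi}"      # hypothetical element naming
            plan.append((element, sample))   # shared per-sample coordinate system
            plan.append((element, element))  # dedicated per-ROI coordinate system
    return plan


# Hypothetical dataset layout: two samples with their ROIs.
rois_by_sample = {"sampleA": ["roi1", "roi2"], "sampleB": ["roi1"]}
for element, coordinate_system in plan_coordinate_systems(rois_by_sample):
    # In a real script, this is where set_transformation would be called
    # for the element and the target coordinate system.
    print(element, "->", coordinate_system)
```

Which transformation to use for each pair depends on the data: an identity may be enough for the per-ROI system, while the per-sample system would typically need the offset of the ROI within the sample.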
graphing connectivity maps: is there a built in function that would graph connectivity maps? or do we just have to layer it using the sq.pl package?
We don't have such a function; please use `squidpy`.
Another comment. In your case you may benefit from what we discussed in this issue: https://github.com/scverse/spatialdata/issues/398, what do you think about it?
Hi, I am putting this in the `-io` package git page, but let me know if it's better to go in the general package page.
I'm dealing with a set of mIF data that have been previously processed (similar to mcmicro, but not exactly). The data set consists of:
I have previously been able to translate, with a lot of trial and error, this data set format into Squidpy-compatible AnnData, and do analysis and plotting on individual ROIs, samples, etc.
With the new SpatialData object, it's not clear to me how I should best approach constructing it. Here are my questions:
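With squidpy, plotting an individual ROI from an AnnData containing multiple ROIs can be controlled by supplying the parameters `library_id` and `library_key`. Is there an equivalent concept in SpatialData? For example, if I'm rendering an Image (a ROI) and a set of cell Shapes, and want to color them by metadata from the table (AnnData), from a SpatialData object containing multiple ROIs, how would the function determine the corresponding data to pull from each module? Is it primarily through unique coordinate systems?
I know there's an "instance_id" parameter (that's inserted when I use the legacy conversion function): but does the instance_id have to be unique across the entire dataset? How about for cell segmentation mask, where the ID is necessarily integer only?
How should we set coordinate system in a dataset like this? Should it be a coordinate for each sample? for each ROI? And is there a faster way to set it up instead of just looping through `set_transformation` multiple times?
graphing connectivity maps: is there a built in function that would graph connectivity maps? or do we just have to layer it using the sq.pl package?
Hopefully this is clear. I'm looking through the different example datasets, but I haven't found one that seems to emulate this dataset format.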