Open ArneDefauw opened 1 month ago
For aggregating points on labels, I've got this workaround: https://github.com/BioinfoTongLI/Image-ST/blob/main/bin/to_spatialdata.py#L111-L118. It is, however, not very memory efficient.
Hi @BioinfoTongLI, for aggregation of points on labels, I have this memory-efficient implementation: https://github.com/saeyslab/harpy/blob/f563bcda27ac0138df94224840496a0d2680008b/src/sparrow/table/_allocation.py#L27
For aggregation of images and labels, we now have this implementation: https://github.com/saeyslab/harpy/blob/f563bcda27ac0138df94224840496a0d2680008b/src/sparrow/utils/_aggregate.py#L16
which basically does all the aggregations that `xrspatial.zonal_stats` and `dask_image` implement, but much faster.
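
To illustrate the general idea of a label-wise aggregation that does not go through `xrspatial.zonal_stats`, here is a minimal NumPy sketch. It is not the harpy implementation linked above and not part of the `spatialdata` API; the function name `aggregate_by_label` is hypothetical. It assumes in-memory arrays of identical shape and non-negative integer label ids. Because both arrays are flattened first, the same code handles `(y, x)` and `(z, y, x)` inputs.

```python
import numpy as np


def aggregate_by_label(image: np.ndarray, labels: np.ndarray) -> dict[str, np.ndarray]:
    """Per-label count, sum and mean for same-shaped arrays of any dimensionality.

    Hypothetical helper for illustration only. Index i of each returned array
    corresponds to label id i (index 0 is usually the background).
    """
    if image.shape != labels.shape:
        raise ValueError("image and labels must have the same shape")

    # Flattening removes any dependence on the number of spatial dimensions.
    flat_labels = labels.ravel().astype(np.intp, copy=False)
    flat_values = image.ravel().astype(np.float64, copy=False)

    # np.bincount counts occurrences per label id and, with weights,
    # sums the weights per label id in a single pass over the data.
    count = np.bincount(flat_labels)
    total = np.bincount(flat_labels, weights=flat_values)

    # Label ids that never occur have count 0, so their mean becomes NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = total / count

    return {"count": count, "sum": total, "mean": mean}


# Small 3D example with shape (z, y, x) = (2, 2, 2).
labels_3d = np.array([[[0, 1], [1, 2]], [[2, 2], [0, 1]]])
image_3d = np.arange(8).reshape(2, 2, 2)
stats = aggregate_by_label(image_3d, labels_3d)
print(stats["sum"])  # sums for label ids 0, 1 and 2
```

Since per-label sums and counts are additive, the same function could in principle be applied chunk by chunk and the partial results added up (padding to a common length), which is one way such an approach might scale to large images.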
Is your feature request related to a problem? Please describe.

- `spatialdata.aggregate` does not support 3D.
- `spatialdata.aggregate` can be slow on large images.
- `spatialdata.aggregate` does not allow aggregation of a labels layer with a points layer.

Describe the solution you'd like
There is currently no 3D support in `spatialdata.aggregate`, e.g. for the aggregation of two raster elements.

`spatialdata.aggregate` can also be slow on large images.
While providing support for 3D with the current implementation using `xrspatial.zonal_stats` is quite straightforward (i.e. just iterate over the z-stacks), I think it makes sense to consider a custom implementation for the aggregation of raster data that does not rely on `xrspatial.zonal_stats` and that is faster. E.g. the solution here for aggregating an image layer with a labels layer ('sum'), https://github.com/saeyslab/harpy/blob/4f557433168f5b4cf8b776c884214b3254e9e6d2/src/sparrow/table/_allocation_intensity.py#L251, is both much faster (30s on the same hardware) and supports 3D.

`spatialdata.aggregate` also does not support aggregation of a points layer by a labels layer.
This means we first have to convert a labels layer to a shapes layer before we can aggregate. Also, because shapes layers have limited support in 3D, supporting aggregation by a labels layer makes sense.
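
As a rough illustration of what aggregating points directly by a labels layer could look like, without the detour through shapes, here is a minimal NumPy sketch. It is not the `spatialdata` API and not the harpy code; `count_points_by_label` is a hypothetical name. It assumes the point coordinates are already in the pixel coordinate system of the labels array, with columns in the same order as the array axes (`(y, x)` or `(z, y, x)`), and that pixel `i` covers the interval `[i, i + 1)`.

```python
import numpy as np


def count_points_by_label(points: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Count points per label id by looking up the label under each point.

    Hypothetical helper for illustration only.
    points: (n_points, ndim) coordinates in pixel units, axis order as in labels.
    labels: integer array with ndim dimensions; 0 is assumed to be background.
    """
    points = np.asarray(points, dtype=float)
    if points.ndim != 2 or points.shape[1] != labels.ndim:
        raise ValueError("points must have shape (n_points, labels.ndim)")

    # Map each coordinate to the index of the pixel/voxel that contains it.
    idx = np.floor(points).astype(np.intp)

    # Drop points that fall outside the labels array.
    shape = np.array(labels.shape)
    inside = np.all((idx >= 0) & (idx < shape), axis=1)

    # One fancy-indexing lookup gives the label id under every remaining point.
    point_labels = labels[tuple(idx[inside].T)].astype(np.intp, copy=False)

    # Index i of the result is the number of points that fall in label id i.
    return np.bincount(point_labels, minlength=int(labels.max()) + 1)


labels_2d = np.array([[0, 1], [2, 2]])
points_yx = np.array([[0.5, 1.5], [1.2, 0.3], [1.9, 1.9], [5.0, 5.0], [0.1, 0.1]])
counts = count_points_by_label(points_yx, labels_2d)
print(counts)  # counts for label ids 0, 1 and 2
```

The lookup touches each point once and never builds polygons, so it works the same way for a 3D labels array. Per-category counts (e.g. per gene) would need an additional grouping step that is not shown here.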
I understand that full 3D support is probably not the main priority, but to future-proof some features of `spatialdata`, I think it is worthwhile to start thinking about it already.