multiscales subsetting - Githubissues

HelenaLC commented 4 days ago

It's not clear to me how to implement subsetting of multiscale ZarrArrays. E.g., say we have scales 4x4 and 2x2. Then the only subsets that could be propagated to all resolutions would be 1..2 and 3..4, which would keep the 1st and 2nd low-res. index, respectively. How is this handled on the Python side? Shall we simply restrict to valid indices (i.e., throw an error)? Or drop low-res. scales that don't comply, e.g., when subsetting 2..3 (potentially with an argument drop=TRUE/FALSE, default FALSE, that would make users aware/in control of this)? ...perhaps worth thinking about whether Mike Smith has this on his radar, i.e., representing the pyramid beyond one layer at a time.

Artur-man commented 4 days ago

Let me give you an idea here, this subsetting strategy work very well in when you do subset based zooming in voltron using pyramid images:

https://github.com/BIMSBbioinfo/ImageArray https://x.com/AltunaAkalin/status/1841864608073572707

ImageArray is programmed the same way that the current multiscale is implemented, when you subset you use the indices of the first scale, and the indices should be scaled down to subset other layers lazily.

HelenaLC commented 4 days ago

Thanks for the link, and yeah, I got that part (i.e., scaling down indices to subset low-res.). However, this doesn't resolve my confusion. Specifically,

the example given for ImageArray works because the numbers nicely scale down
e.g., 2000:3000 (length 1000) becomes 250:375 (length 125) ...or something like that
however, what would happen for the example I gave above, when high-res. indices cannot be scaled to low-res.?
e.g., (from above) arrays 4x4 and 2x2 and index 2:3 wouldn't make any sense for the 2x2 res.

HelenaLC commented 4 days ago

Based on the code here, I would answer my question by: the low-res. is kept completely in case indices don't map. E.g.,

# imagine a 4x4 array 'a', and calling 'a[2:3, ]';
# here, 'i=2' would be a 2x2 array
> curind <- c(2, 3); i <- 2
> seq(
  floor(head(curind,1)/(2^(i-1))), 
  ceiling(tail(curind,1)/(2^(i-1))))
[1] 1 2

Artur-man commented 4 days ago

I thought a lot about this when I was programming pyramids, for large images this small ceilling and floor operations to not really matter (you can interactively see it), but for small examples it looks a bit weird.

HelenaLC commented 4 days ago

Yeah, it's quite intelligent/elegant that little seq() you got there! I think I'll use this for now. I just got confused when subsetting the blobs and nothing happens. The whole pyramid thing is something that hopefully will come in from else where (e.g., Rarr) eventually... till then, we'll just hack our way through it :)

vjcitn commented 4 days ago

Just sticking my nose in. Comments in code can be really crucial, maybe linking back to these issues. Explaining the seq chunk fullly may ward off technical debt payments down the road; apologies if it is already there. Also if there is anticipation of new functionality in Rarr do we need to raise an issue there? I am swamped with other items so not contributing much now but let me know if I am blocking anything.

HelenaLC / SpatialData

multiscales subsetting #90