rafaqz / DimensionalData.jl

Named dimensions and indexing for julia arrays and other data
https://rafaqz.github.io/DimensionalData.jl/stable/
MIT License
282 stars 42 forks source link

should `eachslice` return a `DimArray`? #850

Open simeonschaub opened 1 week ago

simeonschaub commented 1 week ago

I think it would be useful if eachslice did something like the following, so the resulting slices array kept the sliced dimension:

function eachslice2(x::AbstractDimArray; dims, drop=true)
    dim = DimensionalData.dims(x, dims)
    return DimArray(eachslice(x; dims, drop), dim)
end

Does this sound reasonable?

asinghvi17 commented 1 week ago

AFAIK this is why refdims exists in dimarrays, maybe each slice just needs to add that?

asinghvi17 commented 1 week ago

https://github.com/JuliaLang/julia/blob/8593792f8f5212d5513fe0829253664c3ceedd2a/base/slicearray.jl#L228-L245 is the code in Base that does this - we'd simply have to reimplement this for a DimSlices struct here, that would return rebuilt / modify'ed (?) versions of the dimarray (since this should be global to Rasters, YAXArrays, etc), with refdims set to the correct value.

DD already defines Base _eachchunk in methods.jl for arrays and stacks

simeonschaub commented 1 week ago

JuliaLang/julia@8593792/base/slicearray.jl#L228-L245 is the code in Base that does this - we'd simply have to reimplement this for a DimSlices struct here, that would return rebuilt / modify'ed (?) versions of the dimarray (since this should be global to Rasters, YAXArrays, etc), with refdims set to the correct value.

Could you elaborate on what this would look like?

rafaqz commented 1 week ago

Basically we have DimSlices here already for this, but we don't use this for eachslice on AbstractDimArray to be consistent with Base. We use it for DimStack where Slices limits the type to AbstractArray.

I think @sethaxen had a reason for not using DimSlices for both.

But we could shift to using DimSlices for AbstractDimArray as well, then you could index by dimensions. The question is how much are other packages expecting a Slices object from eachslice and does it matter.

rafaqz commented 4 days ago

@simeonschaub do you have thoughts on what the expected return type of eachslice is, and if it matters much?

simeonschaub commented 4 days ago

For my use case it would be really nice if eachslice returned a DimSlices object, I didn't know that already existed. Having base functions on AbstractDimArray return AbstractDimArrays as well seems like the most useful behavior to me, since it's always annoying to lose axis information and having to manually add it back.

rafaqz commented 4 days ago

Yeah, it makes sense.

I think this wording of the docs is the main problem, it specifically says it returns a Slices object for AbstractArray:

help?> eachslice
search: eachslice eachsplit

  eachslice(A::AbstractArray; dims, drop=true)

  Create a Slices object that is an array of slices over dimensions dims of A

We can't even make it <: AbstractSlices without a lot of work to make dimension indexing work on it, as it needs to be AbstractDimArrayGenerator to get everything for free.