Closed sethaxen closed 8 months ago
I thought what was missing was the following method
julia> @inline Dimensions.dims2indices(::Tuple{}, I) = ()
But this has an unintended result. Indexing by a missing keyword then behaves like x[]
:
julia> x[c=1]
3
julia> ds = DimStack((; x, y));
julia> ds[c=1]
┌ Warning: (Dim{:c},) dims were not found in object
└ @ DimensionalData.Dimensions ~/.julia/packages/DimensionalData/K9D4P/src/Dimensions/primitives.jl:645
(x = 3, y = [-1.5090420829521751 -0.629637266507813 -1.1542533262114874; 1.78624724297272 -1.341032743849202 0.2629446089245767])
I would expect x[c=1]
to return x
as a 0-dimensional array with a warning, and similarly I would expect indexing with ds
to return the dataset unchanged.
It is pretty weird I agree. But I would also have defined your missing method, expecting the result you got.
We fill in missing dimensions with :
in dimensional indexing so we can be concise. But here it means calling getindex(x)
after splatting in the empty tuple.
So I guess you are asking to special case this to not call getindex
at all for zero dim arrays? Maybe only with DimStack
?
That also feels pretty wierd to me, getindex
call on a zero dim array always returns a scalar (unless you add extra colons). But honestly I don't know what we should do here and if it should be different to Base, thats probably why the bug is still here.
view
actually does what you want here by the way, as Base doesn't unwrap with view(x)
. That could be a way to deal with this.
Edit: another way to explain why I expect your result is that getindex
always creates a new array for all stack layers, and it still gets called on each layer even when there are no matching dims - just with colons for all dimensions.
This keeps to to our mental model for arrays - that altering an object after getindex
does not change the original.
Maybe what we want for DimStack
is the exact opposite of Base.dotview
: an indexing method that always assigns a new array, but never unwraps zero dims - it copies them instead.
Perhaps I have the wrong mental model of DimStack
, but I would expect that getindex
with just keyword arguments would always return a DimStack
. It's strange to me that if I put a 0-dimensional array in the DimStack
, then I suddenly don't get a DimStack
output anymore when I use keyword arguments to getindex
. The result isn't even a valid Table.
view
actually does what you want here by the way, as Base doesn't unwrap withview(x)
. That could be a way to deal with this.
That's good to know and could let me work around this. I should probably be view
ing more than getindex
ing anyways.
It was hard to work out the behaviour when you index arrays with different dimensions all at the same time - because getindex
returns a scalar if you index with all Int
, but an array with Colon
or a range. So if we index with Int
for all the dimensions of the smallest array we end up with a mix of scalars and arrays.
But from this discussion I think using regular getindex
with DimStack
was actually a mistake - we should instead define a noscalargetindex
method I described in my last comment - that always returns an array.
Then we always have a DimStack
of DimArray
that can become a DimTable
. Even when all of the arrays have no dimensions after indexing.
Would that be better?
But from this discussion I think using regular
getindex
withDimStack
was actually a mistake - we should instead define anoscalargetindex
method I described in my last comment - that always returns an array.
If such a method existed that would indeed be useful. But IIUC, this could be implemented as
noscalargetindex(s::AbstractDimStack; kwargs...) = copy(view(s; kwargs...))
so maybe it's better to just document that users who want this behavior should use this approach instead of having a special method?
Ah wait, it seems view
is not guaranteed to return an AbstractDimStack
?
julia> x = DimArray(randn(2), :a)
2-element DimArray{Float64,1} with dimensions: Dim{:a}
1 -0.753079
2 -0.502682
julia> ds = DimStack((; x))
DimStack with dimensions: Dim{:a}
and 1 layer:
:x Float64 dims: Dim{:a} (2)
julia> view(ds; a=1)
(x = fill(-0.753079230064672),)
That's a problem here rather than in Base so easy fix. Maybe view
has the same check as getindex
to not rebuild, that we just need to speialise.
That's a problem here rather than in Base so easy fix. Maybe
view
has the same check asgetindex
to not rebuild, that we just need to speialise.
Ah, it seems to be due to this inconsistency:
julia> x = DimArray(randn(2), :a);
julia> y = DimArray(randn(2, 3), (:b, :c));
julia> view(x; a=1) # a `DimArray` is not returned
0-dimensional view(::Vector{Float64}, 1) with eltype Float64:
-1.1244365106893395
julia> view(x, 1) # what the above dispatches to
0-dimensional view(::Vector{Float64}, 1) with eltype Float64:
-1.1244365106893395
julia> z = DimArray(fill(randn()), ()); # also a problem for 0-dim arrays
julia> view(z) # fine with no indices
0-dimensional DimArray{Float64,0}:
0.477376
julia> view(z, 1, 1) # or 2
0-dimensional DimArray{Float64,0}:
0.477376
julia> view(z, 1) # but not a DimArray with 1 index
0-dimensional view(::Vector{Float64}, 1) with eltype Float64:
0.4773758383436243
julia> view(y; b=1, c=1) # if we index all dimensions, a `DimArray` is still returned
0-dimensional DimArray{Float64,0}: and reference dimensions: Dim{:b}, Dim{:c}
0.364411
julia> view(y, 1, 1) # what the above dispatches to
0-dimensional DimArray{Float64,0}: and reference dimensions: Dim{:b}, Dim{:c}
0.364411
The issues seems to be this overload, https://github.com/rafaqz/DimensionalData.jl/blob/70f2aefc6c4248a0ab957f51cf8b415f9799a39e/src/array/indexing.jl#L17
For more than one integer index, there's a view
-specific overload:
https://github.com/rafaqz/DimensionalData.jl/blob/70f2aefc6c4248a0ab957f51cf8b415f9799a39e/src/array/indexing.jl#L43-L49
I can open a PR fixing this.