rafaqz / DimensionalData.jl

Named dimensions and indexing for julia arrays and other data
https://rafaqz.github.io/DimensionalData.jl/stable/
MIT License
273 stars 38 forks source link

`end` doesn't work correctly #790

Open greimel opened 1 month ago

greimel commented 1 month ago

The order of arguments matters when indexing with end

julia> a = rand(X(1:5), Y(1:10))
╭──────────────────────────╮
│ 5×10 DimArray{Float64,2} │
├──────────────────────────┴────────────────────── dims ┐
  ↓ X Sampled{Int64} 1:5 ForwardOrdered Regular Points,
  → Y Sampled{Int64} 1:10 ForwardOrdered Regular Points
└───────────────────────────────────────────────────────┘
 ↓ →  1          2          3         …  8         9          10
 1    0.0651066  0.281023   0.997647     0.195063  0.18767     0.535553
 2    0.837051   0.976627   0.819436     0.609175  0.181514    0.995697
 3    0.113039   0.337178   0.511845     0.497278  0.254235    0.479985
 4    0.494619   0.142592   0.359521     0.337024  0.0141267   0.738089
 5    0.424197   0.0445322  0.879901  …  0.153952  0.260781    0.711908

julia> a[X = 1:1, Y = end:end]
╭─────────────────────────╮
│ 1×1 DimArray{Float64,2} │
├─────────────────────────┴──────────────────────── dims ┐
  ↓ X Sampled{Int64} 1:1 ForwardOrdered Regular Points,
  → Y Sampled{Int64} 10:10 ForwardOrdered Regular Points
└────────────────────────────────────────────────────────┘
 ↓ →  10
 1     0.535553

julia> a[Y = end:end, X = 1:1]
╭─────────────────────────╮
│ 1×1 DimArray{Float64,2} │
├─────────────────────────┴─────────────────────── dims ┐
  ↓ X Sampled{Int64} 1:1 ForwardOrdered Regular Points,
  → Y Sampled{Int64} 5:5 ForwardOrdered Regular Points
└───────────────────────────────────────────────────────┘
 ↓ →  5
 1    0.852859

It uses the last index in the first dimension (5), not in the Y dimension.

rafaqz commented 1 month ago

In Julia end is parser level syntax. There is no way we can influence what it does in a package, as we can't see that code at all unless you use macros for everything.

But the good news is DD implements type level begin and end with Begin and End, and these will work in way more places than base end. Like in view.

greimel commented 1 month ago

I see. A big warning in the README/docs would be nice (did I miss it?). Using begin/end can lead to bugs that are pretty hard to find.

(I like your package very much btw!)

rafaqz commented 1 month ago

Yeah I just haven't documented it well, and its quite new. If you want to add it to the docs and the readme that would be very helpful.

What we have now is just docs for Begin and End, they could also have examples, but at least give you the API definitions:

help?> End
search: End end endswith ENDIAN_BOM append! QuoteNode getindex setindex! prevind prepend! expanduser seekend nextind checkindex parentmodule parentindices isdefined LinearIndices @isdefined eachindex

  End <: Position

  End()

  Used to specify the end index of a Dimension axis, as regular end will not work with named dimensions. Can be used with : to create a BeginEndRange or BeginEndStepRange.

  Also used to specify lookup values correspond to the end locus of an interval.

help?> Begin
search: Begin begin disable_sigint reenable_sigint

  Begin <: Position

  Begin()

  Used to specify the begin index of a Dimension axis, as regular begin will not work with named dimensions.

  Can be used with : to create a BeginEndRange or BeginEndStepRange.

But you're right, discoverability is important as this can lead to bugs. Its just one of those things that once you realise A[1, end] is just a dumb built in coversion to A[1, lastindex(A, 2)] you would never try it again on a DimArray. But before that it seems like a horrible bug.

asinghvi17 commented 4 weeks ago

Hmm, do Begin and End work with regular abstract arrays as well? That would be pretty interesting to have for me - wouldn't want to accidentally use the wrong thing when dealing with mixtures of dimarrays and regular arrays.

rafaqz commented 11 hours ago

Yes mostly they will work with regular arrays too. Possibly slightly more widely if you construct them as Begin() and End().