rafaqz / DimensionalData.jl

Named dimensions and indexing for Julia arrays and other data
https://rafaqz.github.io/DimensionalData.jl/stable/
MIT License

Combine multiple functions in the Where selector #533

Closed. felixcremer closed this issue 7 months ago

felixcremer commented 1 year ago

I am wondering whether we could combine multiple functions in the Where selector, selecting all values for which at least one of them returns true. My use case is functions like contains, which return a Boolean-valued function when called with a single argument.

julia> arr = DimArray(rand(10,10,4), (X(1:10), Y(1:10), Dim{:Variable}(["root_moisture", "soil_moisture", "air_temperature", "something"])))
10×10×4 DimArray{Float64,3} with dimensions: 
  X Sampled{Int64} 1:10 ForwardOrdered Regular Points,
  Y Sampled{Int64} 1:10 ForwardOrdered Regular Points,
  Dim{:Variable} Categorical{String} String["root_moisture", "soil_moisture", "air_temperature", "something"] Unordered

julia> arr[Variable=Where(x->contains(x,"moisture") || contains(x,"temp"))] # This is annoying to write, especially for more values.
10×10×3 DimArray{Float64,3} with dimensions: 
  X Sampled{Int64} 1:10 ForwardOrdered Regular Points,
  Y Sampled{Int64} 1:10 ForwardOrdered Regular Points,
  Dim{:Variable} Categorical{String} String["root_moisture", "soil_moisture", "air_temperature"] Unordered

julia> arr[Variable=Where(contains.(["moisture","temp"]))] # This would be a nice shorthand for the above function 
ERROR: MethodError: objects of type Vector{Base.Fix2{typeof(contains), String}} are not callable
Use square brackets [] for indexing an Array.
Stacktrace:
  [1] (::DimensionalData.Dimensions.LookupArrays.var"#39#41"{Where{Vector{Base.Fix2{typeof(contains), String}}}})(::Tuple{Int64, String})
    @ DimensionalData.Dimensions.LookupArrays ./none:0
  [2] iterate
    @ ./iterators.jl:514 [inlined]
  [3] iterate
    @ ./generator.jl:44 [inlined]
  [4] grow_to!
    @ ./array.jl:855 [inlined]
  [5] collect
    @ ./array.jl:779 [inlined]
  [6] selectindices
    @ ~/.julia/packages/DimensionalData/4TpBG/src/LookupArrays/selector.jl:858 [inlined]
  [7] _dims2indices
    @ ~/.julia/packages/DimensionalData/4TpBG/src/Dimensions/indexing.jl:114 [inlined]
  [8] macro expansion
    @ ~/.julia/packages/DimensionalData/4TpBG/src/Dimensions/indexing.jl:56 [inlined]
  [9] _dims2indices
    @ ~/.julia/packages/DimensionalData/4TpBG/src/Dimensions/indexing.jl:56 [inlined]
 [10] dims2indices
    @ ~/.julia/packages/DimensionalData/4TpBG/src/Dimensions/indexing.jl:51 [inlined]
 [11] dims2indices
    @ ~/.julia/packages/DimensionalData/4TpBG/src/Dimensions/indexing.jl:34 [inlined]
 [12] #getindex#39
    @ ~/.julia/packages/DimensionalData/4TpBG/src/array/indexing.jl:49 [inlined]
 [13] top-level scope
    @ REPL[27]:1
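
The error arises because `contains.(["moisture","temp"])` broadcasts into a `Vector` of single-argument predicates, and a vector is not itself callable. One possible workaround, sketched here in plain Julia, is to fold the predicates into a single function first. Note that `anyof` is a hypothetical helper name chosen for illustration, not part of DimensionalData.jl or Base.

```julia
# Hypothetical helper (not part of DimensionalData.jl or Base): combine
# several predicates into one function that is true when any of them is true.
anyof(predicates) = x -> any(p -> p(x), predicates)

variables = ["root_moisture", "soil_moisture", "air_temperature", "something"]

# contains.([...]) broadcasts into a Vector of Base.Fix2 predicates,
# which anyof wraps into a single callable.
is_match = anyof(contains.(["moisture", "temp"]))

# Keep only the names for which at least one predicate holds.
selected = filter(is_match, variables)
```

Since `Where` takes a function, the combined predicate should be usable as `arr[Variable=Where(anyof(contains.(["moisture","temp"])))]`, although that call is not tested here.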

rafaqz commented 1 year ago

Maybe this is cleaner?

arr[Variable=Where(x -> any(occursin(x), ("moisture", "temp")))]

I think it will be hard to get shorter than that.
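
For readers unfamiliar with the one-argument form: `occursin(x)` returns a function equivalent to `needle -> occursin(needle, x)`, so the predicate above asks whether any of the patterns occurs in the lookup value `x`. A minimal sketch on plain strings, assuming Julia 1.6 or later for the single-argument `occursin`; the names `patterns` and `matches_any` are illustrative only.

```julia
patterns = ("moisture", "temp")

# occursin(x) is the curried form: needle -> occursin(needle, x).
# The predicate is true when any pattern occurs in x.
matches_any(x) = any(occursin(x), patterns)

variables = ["root_moisture", "soil_moisture", "air_temperature", "something"]

# Broadcast the predicate over the lookup values to get a Boolean mask.
mask = matches_any.(variables)
```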

But we could add something like a Matches selector that did that, like

arr[Variable=Matches(contains, ("moisture", "temp"))]
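
To make the idea concrete, the behaviour of such a selector can be approximated with a callable struct in plain Julia. This is only a sketch under assumptions: DimensionalData.jl does not define `Matches`, and the field names and the "true if any pattern matches" semantics below are guesses at what was intended.

```julia
# Hypothetical sketch only: DimensionalData.jl does not define `Matches`.
struct Matches{F,P}
    f::F          # two-argument predicate, e.g. `contains`
    patterns::P   # collection of values to test each element against
end

# Make the struct callable: true when f(x, p) holds for any pattern p.
(m::Matches)(x) = any(p -> m.f(x, p), m.patterns)

variables = ["root_moisture", "soil_moisture", "air_temperature", "something"]

m = Matches(contains, ("moisture", "temp"))

# A callable struct works anywhere a predicate function is expected.
selected = filter(m, variables)
```

Because `Where` appears to store and call whatever object it is given, a callable like this could presumably be wrapped as `Where(Matches(contains, ("moisture", "temp")))` without any change to the package, though that is untested here.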

rafaqz commented 7 months ago

I think my first suggestion is probably "good enough".