Open rafaqz opened 3 years ago
I agree we can make this change. This would be breaking for DataAPI.jl, so 2.0 release would be needed, which is a bit painful given then number of dependants https://juliahub.com/ui/Packages/DataAPI/3a8mN/1.8.0?t=2. However, I assume that only DataFrames.jl uses Between
now, so it would be just a matter of changing 1
to 1, 2
in Project.toml files.
In DataFrames.jl we could allow both 1 and 2 release of DataAPI.jl and appropriately handle both cases.
AN ALTERNATIVE that would be non-breaking would be to define Between
to hold two fields of any types (so then DD and AxisKeys.jl would need to accommodate). This would be less breaking (so I feel it is preferred), but maybe there are some benefits of holding a 2-tuple so let us discuss.
CC @nalimilan
If it's less breaking DimensionalData.jl can change to 2 fields, there is a breaking version bump coming up soon anyway, and it wont affect behavior as the constructor already accepts either Tuple
or 2 arguments.
The reason for using 1 field was so that all Selector
s have a val
field holding the thing to be selected, which is a single value or a Tuple
. Just to standardise that aspect. The val
method also returns that value. But otherwise it's not important.
DD also defines rebuild
methods for these to update the contents programatically, with ConstructionBase.setproperties
as the default keyword constructor. Keeping field names the same removes all boilerplate. You can rebuild
any selector with a modified value using the same syntax.
rebuild(x; kw...) = ConstructionBase.setproperties(x, (; kw...))
So that:
rebuild(sel; val=2 .* val(sel))
Or similar will just work, even when a Selector
has additional metadata fields, like atol
for At
. Otherwise updating Selctor
values progoramatically has to be special-cased for each selector type.
DD also defines
rebuild
methods for these to update the contents programatically
This is what I have feared that there might be such cases. Let us wait for @nalimilan to comment in the first place what he thinks about merging the definitions and then we will decide what path should be taken to merge them (as maybe we can agree that DataAPI.jl definition is "internal" and changing it is non-breaking as long as we keep the external API untouched).
Merging definitions sounds like a good idea, but tagging DataAPI 2.0 now is a no-go IMO. If we really want to switch to a single field, we could define appropriate constructors and getproperty
/setproperty!
methods to continue supporting the previous syntax.
That said, I don't really get the value of being able to do rebuild(sel; val=2 .* val(sel))
on any type of selector. Does happen that you don't know what kind of selector you have and just want to multiply the values by 2? As a counter-argument, it seems that rebuild(sel; last=2)
could be more often useful if you want to keep the starting point but change the ending point.
AxisKeys.jl takes Interval
is from IntervalSets.jl, and Not
from InvertedIndices.jl. I doubt these two will want extra supertypes (although IntervalSets.jl depends on EllipsisNotation.jl which I see now depends on ArrayInterface.jl so it's not 10 lines anymore.)
Its struct Near{T} <: Selector{T} val::T end
holds anything, and has the meaning of taking the nearest one element. I think that's the same meaning as DD.jl. The alternative meaning would be a value with a tolerance.
One small consideration to also mention is https://github.com/JuliaData/DataAPI.jl/blob/92b0def99ec396a50ab22d594ed9c37920537156/src/DataAPI.jl#L196. This is intended to allow for lazy broadcasting of the selector (i.e. that it can be properly expanded by a caller side that is aware of the actual range of indices selected). Would this break something in the other packages?
@nalimilan Multiplying values by 2 was a bad example... practically it's more like swapping out their contents, e.g. a Vector
, for the individual values of the vector. But 2 fields is fine.
But maybe DD can also move to using Interval
from IntervalSets.jl like AxisKeys, and this wont be an issue. I used Between
because the pattern is selectors as positional words. And interval is confusing when an index can represent intervals or points, both of with can be selected with Between
.
@mcabbott yes Near
means the same thing in both AK and DD.
DimensionalData.jl defines a
Selector
abstract type and selectorsBetween
,Near
,At
,Where
, andContains
.AxisKeys.jl also defines
Selector
andNear
andInervals
selectors (ping @mcabbott)It would be useful to standardise these so we can all use the simplest words without clashes, and this is probably the place for that.
We could define
Selector
here, and makeBetween <: Selector
.Near
could also move here to avoid the (maybe less common) clashes between AxisKeys.jl and DimensionalData.jl, if @mcabbott is also interested in sharing these types.Between
in DD holds a 2-tuple of any types, while here it has two fields limited toUnion{Int,Symbol}
.