JuliaArrays / AxisArrays.jl

Performant arrays where each dimension can have a named axis with values
http://JuliaArrays.github.io/AxisArrays.jl/latest/

Slicing until end #149

Open kcajf opened 5 years ago

kcajf commented 5 years ago

Is there an already implemented way to slice an AxisArray along an axis from a given point to the end, similar to how one can slice a normal Array with integers? For example:

using AxisArrays, Dates
B = AxisArray(randn(Float64, (10, 10)), Axis{:time}(Date(2018, 1, 1):Day(1):Date(2018, 1, 10)))

We can do this:

B[5:end, :]

I would like to do something like:

B[Date(2018, 1, 5):end, :]

kcajf commented 5 years ago

This could be implemented by extending / using ideas from https://github.com/JuliaArrays/EndpointRanges.jl

kcajf commented 5 years ago

Does anyone have strong opinions on the best way to implement this? In particular, I think it is important that it have a very simple syntax, ideally similar to the current range slicing mechanism B[Date(2018, 1, 5)..Date(2018, 3, 4)].

timholy commented 5 years ago

Hmm, interesting question. The more consistent approach would be to leverage the IntervalSets-based slicing. Perhaps a macro that replaces begin and end with a concrete value? E.g., @axisslice(B[Date(2018, 1, 5)..end, :]).

kcajf commented 5 years ago

That's an option, but it feels a bit verbose and clunky. This is an operation I tend to use a lot in Python pandas, and it is thankfully very syntax-light, e.g.:

x = pd.DataFrame(...)
x.loc['2014-04-01':, :3]

Python at the language level converts the '2014-04-01': into a slice('2014-04-01', None) object and passes it, as part of the index key, to the .loc indexer's __getitem__(). It seems a bit of a shame that Julia doesn't provide an extensible way to use the begin and end keywords at the language level. Do you know if this is a deliberate decision / has some context, or just hasn't been needed yet?
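
To make the mechanism described above concrete, here is a minimal pure-Python sketch that needs no pandas. `Recorder` is a hypothetical helper, not part of any library; it just returns whatever key the indexing syntax hands to `__getitem__`.

```python
class Recorder:
    """Hypothetical helper: returns the key that indexing syntax produces."""

    def __getitem__(self, key):
        return key


r = Recorder()

# An open-ended slice arrives as a slice object whose stop is None.
key = r['2014-04-01':]
print(key)

# With two indices, the key is a tuple of slice objects.
rows, cols = r['2014-04-01':, :3]
print(rows.start, rows.stop, cols.stop)
```

Because the open end shows up as a plain `None` inside an ordinary object, the receiving class is free to interpret it however it likes, which is the extensibility being asked about here.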

timholy commented 5 years ago

end is analyzed and then substituted by the parser (inside an indexing expression it is rewritten into a lastindex call on the array being indexed), so we don't have an object we can dispatch on.

As an alternative to the macro, EndpointRanges could be broadened to include IntervalSets intervals, and then one could write Date(2018, 1, 5)..iend.
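
The general idea of a deferred endpoint can be sketched as follows. This is not EndpointRanges or IntervalSets code; it is a small illustration in Python (chosen only for continuity with the pandas example earlier in the thread), and `IEND`, `_End`, and `resolve` are hypothetical names. The sentinel stands in for "the last value of the axis" and is only replaced by a concrete value once the axis is known.

```python
from bisect import bisect_left, bisect_right
from datetime import date, timedelta


class _End:
    """Hypothetical sentinel meaning 'last value of the axis'."""


IEND = _End()


def resolve(axis, start, stop):
    """Return the integer slice covering the closed interval [start, stop].

    `axis` is a sorted list of axis values. `stop` may be IEND, in which
    case it is replaced by the last axis value at indexing time.
    """
    if stop is IEND:
        stop = axis[-1]
    lo = bisect_left(axis, start)   # first position with value >= start
    hi = bisect_right(axis, stop)   # one past the last position with value <= stop
    return slice(lo, hi)


# Ten consecutive days, mirroring the time axis in the original question.
axis = [date(2018, 1, 1) + timedelta(days=i) for i in range(10)]
s = resolve(axis, date(2018, 1, 5), IEND)
selected = axis[s]
```

The point of the design is that the sentinel is an ordinary value, so it can be stored inside an interval object and dispatched on, which the built-in end keyword does not allow.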

kcajf commented 5 years ago

Yes, exactly - I was saying that it is annoying that begin/end expansion happens at the parser level, which makes it inextensible. I'll look into EndpointRanges and see what can be done.

kcajf commented 5 years ago

It might be a bit more work, but would IntervalSets benefit more generally from left-unbounded and right-unbounded intervals (for general, non-numerical types)?