Looking at this in isolation, it sounds fine to me. However, problems may arise if we consider it in conjunction with broadcasting.
Consider this:
A .+ A[1,:]
At first glance, I would suppose that the behavior is to add the first row of A to each row. However, if we squeeze out the first dimension, this would fail.
To extend this a little bit, should we also squeeze the reduced dimension for reduction operations? What would be the shape of sum(A, 1)? When we want to make a decision about this, I would invite people to consider the following example, which represents a very common operation in practice:
A .- mean(A, dim)
It would simply become A .+ A[1:1,:]. Range-indexing wouldn't squeeze, Int-indexing would.
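For concreteness, a minimal sketch of the shapes at stake (written with today's broadcasting syntax; only the shapes matter here):

A = [1 3; 2 4]
# Non-squeezing rule: A[1,:] is a 1x2 row matrix, so A .+ A[1,:] adds the
# first row to every row of A.
# Squeezing rule: A[1,:] becomes the length-2 vector [1, 3], which broadcasts
# down the columns instead (or errors when A is not square), silently changing
# the meaning of A .+ A[1,:].
# Under either rule, A[1:1,:] stays a 1x2 row matrix and keeps the old meaning:
size(A[1:1, :])   # (1, 2)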
Another perspective on this: currently slice works this way, and sub never squeezes. If we're basically getting rid of those functions in preference to getindex, then slice's behavior may be the one we need to emulate, since sub doesn't give you any choices.
In my view, it's pretty clear that reductions shouldn't squeeze, but I recognize this is potentially contentious.
See also https://github.com/JuliaLang/julia/issues/4774 and https://github.com/JuliaLang/julia/issues/3262. I kind of wish there weren't so many issues about basically the same thing.
I would be happier with not dropping any dimensions and having slicing always return a tensor of the same rank as the tensor that is being sliced. Are there any downsides to that? Vectors would tend to get turned into column matrices, but that's fine, imo. We could also make [1;2;3] and the corresponding form with newlines construct column matrices instead of vectors.
I think that would be ok; I suspect it's more important to formalize the embedding of N-d tensors in (N+1)-d tensors with a trailing singleton dimension. Then extra dimensions can generally be ignored, rather than explicitly dropped.
I'm also a big supporter of @timholy's proposal. Since in Julia you can distinguish scalars from vectors and ranges, it sounds natural to allow using this difference to indicate what the shape of the resulting slice should be.
I have an unorthodox proposal: how about someone who's in favor of this go ahead and implement a Julia array type that has this indexing behavior. It doesn't have to be completely functional – I just think it would help a lot to have a mockup to play with. Then people (like me), who are skeptical of this idea can try it out and see if it's really as odious as it sounds ;-)
I opened this issue because I'm basically intending to do just that; this is the "shot across the bow" from someone with a dangerous track record :smile: when it comes to reworking much of Julia's array indexing. Like you, I don't actually know how this will work out, and I am kind of curious to know whether I will like it.
But my general feeling is that perhaps this should be timed with other disruptions, like switching to views---both will break code, this much more than views. That said, if you really would play with this well in advance of getting serious about merging it, then perhaps one doesn't need to wait.
I think the issue has served its purpose, and the next step is just to do it. So this can be closed.
@nalimilan, while I agree with you that this proposal is logical and adds power---which is why I think it's worth exploring---the part I'm worried about is that it may be so much more common to want not to slice dimensions that I'll get annoyed at writing ranges all the time. Moreover, : has quite low precedence and gets used both for ranges and for conditionals (the x?a:b syntax), so this will inevitably add to the number of parentheses in code.
Actually, let me use this to ask one more thing: if I make the core changes to base/, is anyone willing to take the lead in modifying the scripts in test/ to be consistent with the new behavior?
Should we go full APL and make the rank of the result the sum of the ranks of the indexes?
Instead of changing what's in base, can you add a new type? (Maybe that's what you plan.)
@JeffBezanson, can you point me to a link? I think I understand what you're asking, but I'm not sure. I'd have to think about the implementation, but it sounds doable.
@tshort, I don't think it will be an effective experiment if we don't really try it. If we do try it (and it's by no means too late to convince me otherwise), my plan would be to post a branch and people could play with it. I'm also mulling over whether there's a way to make the transition safely with an appropriate macro or two. It's not obvious there is, but I think it's worth careful consideration.
If we do make this change, then it only makes sense to go "full APL", imo.
Can someone provide a link on what APL indexing semantics actually are? Given @JeffBezanson's description, it seems more profound than just dropping singleton dimensions or not.
Also, since there seem to be definite use cases for both dropping and keeping dimensions, how about providing both operations? This could even make the transition non-breaking.
Or does slice already do what is wanted? Could it? (I didn't even realize that we had it, so I'm not sure exactly what it does.)
I agree that we should seriously consider adopting the APL rules, which are much more expressive.
Having looked at the rules briefly, yes, it sounds like a good idea to go all the way if we do this.
@toivoh, the new behavior would imitate slice:
julia> A = reshape(1:4, 2, 2)
2x2 Array{Int64,2}:
1 3
2 4
julia> A[1,:]
1x2 Array{Int64,2}:
1 3
julia> sub(A, 1, :)
1x2 SubArray{Int64,2,Array{Int64,2},(Range1{Int64},Range1{Int64})}:
1 3
julia> slice(A, 1, :)
2-element SubArray{Int64,1,Array{Int64,2},(Int64,Range1{Int64})}:
1
3
julia> slice(A, 1:1, :)
1x2 SubArray{Int64,2,Array{Int64,2},(Range1{Int64},Range1{Int64})}:
1 3
What's your idea for making this non-breaking? I currently have what might be an effective, but somewhat complex, plan for migration. The key is to use deprecation on indexing operations whose behavior is changing, but provide users a mechanism for indicating that other indexing operations should follow the new behavior and therefore not be subject to the deprecation behavior:

1. getindex(A, i::Int, j::AbstractVector) would return a warning, but follow the old behavior (using @deprecate).
2. A new expression head, :newref, that gets converted into getindex_apl/setindex!_apl operations. This is the part I'm currently least sure about implementing.
3. Two macros, @ni and @newindex. The first is used on single expressions, @ni A[3,:], to create a :newref expression. The second wraps whole blocks of code and converts all :refs into :newrefs. The safest migration path would be to use @ni on an expression-by-expression basis until you're sure you've converted them all, then clear them all out by wrapping the entire thing in @newindex. And new code could start with @newindex right away, of course.

A bit ugly, but I'm not sure I see a good alternative.
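For concreteness, a rough sketch of what the single-expression macro might look like; the names @ni and getindex_apl are only the ones floated above, and none of this exists yet:

# Hypothetical sketch: rewrite one indexing expression to the "new" getindex.
macro ni(ex)
    isa(ex, Expr) && ex.head == :ref ||
        error("@ni expects an indexing expression like A[i, j]")
    # A[i, j, ...]  ==>  getindex_apl(A, i, j, ...)
    esc(Expr(:call, :getindex_apl, ex.args...))
end

# Usage: @ni A[3, :] would expand to getindex_apl(A, 3, :).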
I'm going to be occupied for the next several weeks with other tasks, and in any event this is 0.4 material. Let me make sure that folks noticed my request for collaboration on this issue, specifically on converting the files in /test to the new behavior: CC @isorber, @wenxgwen, @hayd, @BobPortmann, @nalimilan.
While I am curious to know how this indexing would work out in practice, much of my motivation for being willing to try it is because of the complaints about the current scheme. (I've implemented much of Julia's current array indexing, so I feel a certain amount of responsibility for handling complaints about how it works.) I'm reasonably happy with Julia's current behavior; if none of you are willing to share the pain of this transition, then I know I don't need to feel guilty about anything :smile: and just might find better uses for my time.
Looks like that should have been CC @lsorber.
Forgot @malmaud
@timholy If we agree to change behavior (in whatever way), it's better to do it along with the array view framework. If we are going to deprecate sub-arrays in favor of array views (with GenericView, it should cover 100% of what a sub-array can do), then we can implement this behavior in array views, and delegate getindex to view.
@timholy Full APL sounds good to me.
@lindahua, 100% agree that if we're going to make one breaking change in Arrays, it's the perfect time for a second. In fact I wonder about making 0.4 a quick release (like, 1 month) focused on breaking changes to the Array API. That way people don't have to have code hanging around for a long time with @newindex around everything (which will slow down parsing).
@malmaud, since you're one of several who has argued for this, what I was actually CCing you on was the request for a collaborator on implementation (see above). Was that a yes? :smile:
@timholy Oh hah, sure. I'd be excited to help out with the APLification.
I agree to focus on stabilizing the Array API in the 0.4 cycle.
Before we actually implement anything or revise real code, what about somebody putting together a specification of the array API (in a gist or somewhere)?
I would like to help, but don't have much time this week due to an upcoming deadline.
@malmaud, that would be great!
@lindahua, I don't have time for the next several weeks (mid/late March grant deadline). But after that I'll be ready to charge full steam ahead.
It seems to me that full APL allows each of the indices to be an array of any dimension, but I still haven't seen a clarification on how it behaves when using more than one index of dimension greater than 1.
Also, the non-breaking alternative I was talking about was to keep getindex more or less as it is (maybe make it not drop trailing dimensions either?), and use slice when you want APL indexing. I'm not sure whether it's a good alternative, but I feel that it might deserve a little discussion.
+1 for using slice for full APL indexing, plus either dropping trailing singletons like we currently do, or using absent dimensions.
I'm assuming it's supposed to act like reshape(A[ind1[:], ind2[:]], tuple(size(ind1)..., size(ind2)...)).
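A small illustration of that rule with multi-dimensional index arrays (this is how index arrays behave in current Julia, but at the time of this thread it was only a proposal):

A    = reshape(1:12, 3, 4)
ind1 = [1 2; 3 1]      # 2x2 array of row indices
ind2 = [1, 3, 4]       # length-3 vector of column indices
B    = A[ind1, ind2]   # rank 2 + 1 = 3, so a 2x2x3 result
B == reshape(A[ind1[:], ind2[:]], tuple(size(ind1)..., size(ind2)...))   # true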
Your alternative deserves discussion. We'd want both getslice and setslice!, of course.
Of course. And yes, that is what I would suppose as well.
I could be persuaded, though the right course is not obvious. I think the main argument the other way is that because the transition to returning views will be breaking, it's better to time any other breakages to go along with it. And it seems few know about or use slice now, even though it implements the kind of indexing under discussion. (It returns a SubArray, of course, and that may account for its lack of popularity.)
But if we're unsure whether we even want to go there, introducing getslice and setslice! would still be a step forward.
Other folks? It's an important decision.
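To make the getslice/setslice! alternative concrete, here is one possible sketch (the name getslice is just the one proposed above, and this uses current spellings such as dropdims rather than the squeeze of that era):

# Hypothetical sketch: index like getindex, but drop every dimension that was
# indexed with a scalar, per the APL-style rule under discussion.
function getslice(A::AbstractArray, I...)
    J    = map(i -> isa(i, Real) ? (i:i) : i, I)                # widen scalars to 1-element ranges
    drop = Tuple(d for (d, i) in enumerate(I) if isa(i, Real))  # dimensions to squeeze out
    dropdims(A[J...]; dims=drop)
end

getslice(rand(3, 4), 2, :)    # a 4-element Vector rather than a 1x4 Matrix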
I'm not sure what the best solution is between making getindex() drop dimensions and only making getslice() act this way. But I really don't like the idea of dropping trailing singleton dimensions and not others; that sounds really confusing and does not make much sense in my own practice.
Maybe we should make a list of use cases where dropping dimensions indexed by scalars is desirable or not, see which case is most common, and see whether each behavior has cases in which it's really annoying?
Another solution would be to add a {} indexing operator which would call getslice(). That way it's easy to get the behavior you want.
The { } syntax is already spoken for, so that's not viable, I'm afraid. Listing pro and con use cases seems like a good idea, although we can't really just count them up – there's still a huge amount of subjectivity about which ones are important.
I just realized that the "absent" dimension idea doesn't actually let you avoid dropping trailing singleton dimensions. For one thing, it's somewhat undesirable to have three things that are supposed to be column vectors: 1-d vectors, n x absent "vectors", and n x 1 matrices. For another thing, it wouldn't solve the v'*v problem – namely, you want this to be the dot product and produce a scalar, not an "absent x absent" tensor. I guess you could get that just by special-casing situations where the resulting tensor would have only absent dimensions as producing scalars.
I also realized that since reductions like sum(A,d) and mean(A,d) already produce a tensor of the same rank as their input, A ./ sum(A,d) would continue to work correctly, regardless of slicing behavior, unless we also insisted on changing this behavior to produce a tensor of rank ndims(A)-ndims(d). Doing A .+ A[1,:] would, however, definitely break.
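To spell out the reduction case in today's syntax (the thread writes mean(A, dim); current Julia uses a dims keyword and the Statistics stdlib), the shapes work out like this:

using Statistics
A = rand(3, 4)
m = mean(A; dims=1)   # 1x4: the reduction keeps the reduced dimension
A .- m                # 3x4: centers each column, under either slicing rule
# A .+ A[1, :] is the case that changes: it stays row-wise only if A[1, :]
# remains a 1x4 row rather than becoming a length-4 vector.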
I also think that listing use cases would be a good idea. I could imagine that e.g. dropping only trailing singleton dimensions doesn't come up very often; when you want to preserve your dimensions you want to preserve all of them.
I'm also not so sure that it would be very good to implement my absent dimensions idea :) especially in full generality. But yes, the thought was that an array with only absent dimensions could be unwrapped into a scalar, much like we sometimes unwrap results that would have been 0-dimensional arrays right now.
I thought about this more over the weekend. I am convinced by @timholy's argument that one can write a[i:i, :] to preserve dimensions when desired. I am fine with dropping all dimensions corresponding to integer indices.
i:i may be fine to preserve dimensions, but that will quickly get annoying when you want to index with something even slightly complex and/or verbose. E.g. (frobozz + 2 gazonk):(frobozz + 2 gazonk) is no fun at all.
I think the thing to do is use a[[i],:] as the idiom because there's no repetition of substantive content. I.e. you just write a[[frobozz + 2gazonk],:] and it's still just two more characters (and the amortized typing overhead is even lower).
But the runtime overhead is higher since we will have to allocate a Vector to store the index in, right? Since ranges are immutable, they could avoid this overhead (and with enough inlining, even avoid the overhead due to duplication).
@StefanKarpinski when using a[range, :] we know statically that it is going to be a strided view. However, for a[vector, :] we cannot be sure of this statically ([1] and [1,4,8] are of the same type) and have to fall back to less efficient view types. Also, as @toivoh noted, [1] leads to unnecessary construction of an array. Therefore, a[[i], :] is not a recommended approach, especially when efficiency is important.
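To spell out the type-inference point (with current type names; the REPL output earlier in the thread shows the older Range1{Int64} spelling):

i = 3
typeof(i:i)        # UnitRange{Int64}  -- exactly one element, known from the type
typeof([i])        # Vector{Int64}     -- indistinguishable, by type, from ...
typeof([1, 4, 8])  # Vector{Int64}     -- ... an index vector of any length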
We may introduce an inline function to create i:i from i, so people won't have to write the index twice. I can't come up with a proper name that is also short, though.
You're totally right about the type inference problem there. I can get behind the a[[i],:] syntax, but the a[i:i,:] syntax is just too nasty, not to mention the repetition of the substance. I guess a[v(i),:] or something like that would be possible, but this is really not feeling like a fun thing to have to write all the time. So far, the objections to dropping trailing singleton slices don't strike me as strong enough to warrant this kind of inconvenience. I'd like to see that list of problems that NumPy/APL style slicing solves versus the problems that it creates.
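A sketch of that kind of helper, borrowing the v name from the comment above (purely hypothetical; nothing like it existed in Base):

# Hypothetical helper: wrap a scalar index so the dimension is preserved
# without writing the index expression twice.
v(i::Integer) = i:i
v(r::AbstractRange) = r     # ranges already preserve the dimension; pass through

a = rand(4, 5)
a[v(2), :]    # 1x5 under the proposed rule, where a plain a[2, :] would drop the row dimension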
Wouldn't it be possible to special-case [x] when x is a scalar so that it can be inlined and eventually completely skipped when doing a[[x], :]?
The [x] syntax is a bit special since it's at the intersection of the [a,b,c] and [a;b;c] syntaxes.
@StefanKarpinski But in the case where x is a single scalar, it's pretty simple, isn't it?
@timholy I'm definitely happy to help in changing the tests! (I would really like this change. :+1: )
Do we have the @ token to play with? Something like @i or @(expression) would be one char in most cases and may have a relatively obvious meaning... :S
That syntax is not available – @i is the invocation of the i macro.
It seems likely that 0.4 will see a transition to array views (#5556). This change will surely break some code, e.g., for anyone who was relying on the behavior of setindex! applied to the output of A[3:15, 7]. Perhaps we should take the occasion to simultaneously think about another (much worse) breakage.

As we've gotten less concerned about breaking Matlab compatibility (and as I've learned more about the wider scientific ecosystem), I've increasingly begun to wonder whether NumPy's slicing rules are the right way to go. It's very simple: if you index along a particular dimension with an AbstractVector, that dimension is retained; if you index with a scalar, that dimension is dropped. So for a 2d array A, A[:,:] produces 2d output, A[3,:] and A[:,3] both produce 1d output, and A[3:3,:] and A[:,3:3] both produce 2d output. Note this scheme is type-stable, so AFAICT there are no performance gotchas.

But wow would it be a breaking change.
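In code, the proposed rule reads like this (shown with today's Julia, which eventually adopted it; at the time of this issue none of it was implemented):

A = rand(4, 5)
size(A[:, :])     # (4, 5) -- two vector-like indices keep both dimensions
size(A[3, :])     # (5,)   -- a scalar first index drops the first dimension
size(A[:, 3])     # (4,)   -- a scalar second index drops the second dimension
size(A[3:3, :])   # (1, 5) -- a one-element range keeps the dimension
size(A[:, 3:3])   # (4, 1)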