drop dimensions indexed with a scalar?

timholy commented 10 years ago

It seems likely that 0.4 will see a transition to array views (#5556). This change will surely break some code, e.g., for anyone who was relying on the behavior of setindex! applied to the output of A[3:15, 7]. Perhaps we should take the occasion to simultaneously think about another (much worse) breakage.

As we've gotten less concerned about breaking Matlab compatibility (and as I've learned more about the wider scientific ecosystem), I've increasingly begun to wonder whether NumPy's slicing rules are the right way to go. It's very simple: if you index along a particular dimension with an AbstractVector, that dimension is retained; if you index with a scalar, that dimension is dropped. So for a 2d array A, A[:,:] produces 2d output, A[3,:] and A[:,3] both produce 1d output, and A[3:3,:], A[:,3:3] both produce 2d output. Note this scheme is type-stable, so AFAICT there are no performance gotchas.

But wow would it be a breaking change.

lindahua commented 10 years ago

Looking at this in isolation, it sounds fine to me. However, problems may arise if we consider it in conjunction with broadcasting.

Consider this:

A .+ A[1,:]

At the first glance, I would suppose that the behavior is to add the first row of A to each row. However, if we squeeze out the first dimension, this would fail.

To extend this little bit, shall we squeezed reduced dimension for reduction operation? What would be the shape of sum(A, 1)? When we want to make decision about this, I would invite people to consider the following example that represents a very common operation in practice:

A .- mean(A, dim)

timholy commented 10 years ago

It would simply become A .+ A[1:1,:]. Range-indexing wouldn't squeeze, Int-indexing would.

Another perspective on this: currently slice works this way, and sub never squeezes. If we're basically getting rid of those functions in preference to getindex, then slice's behavior may be the one we need to emulate, since sub doesn't give you any choices.

In my view, it's pretty clear that reductions shouldn't squeeze, but I recognize this is potentially contentious.

StefanKarpinski commented 10 years ago

See also https://github.com/JuliaLang/julia/issues/4774 and https://github.com/JuliaLang/julia/issues/3262. I kind of wish there weren't so many issues about basically the same thing.

StefanKarpinski commented 10 years ago

I would be happier with not dropping any dimensions and having slicing always return a tensor of the same rank as the tensor that is being sliced. Are there any downsides to that? Vectors would tend to get turned into column matrices, but that's fine, imo. We could also make [1;2;3] and the corresponding form with newlines construct a column matrices instead of vectors.

JeffBezanson commented 10 years ago

I think that would be ok; I suspect it's more important to formalize the embedding of N-d tensors in (N+1)-d tensors with a trailing singleton dimension. Then extra dimensions can generally be ignored, rather than explicitly dropped.

nalimilan commented 10 years ago

I'm also a big supporter of @timholy's proposal. Since in Julia you can distinguish scalars from vectors and ranges, it sounds natural to allow using this difference to indicate what the shape of the resulting slice should be.

StefanKarpinski commented 10 years ago

I have an unorthodox proposal: how about someone who's in favor of this go ahead and implement a Julia array type that has this indexing behavior. It doesn't have to be completely functional – I just think it would help a lot to have a mockup to play with. Then people (like me), who are skeptical of this idea can try it out and see if it's really as odious as it sounds ;-)

timholy commented 10 years ago

I opened this issue because I'm basically intending to do just that; this is the "shot across the bow" from someone with a dangerous track record :smile: when it comes to reworking much of Julia's array indexing. Like you, I don't actually know how this will work out, and I am kind of curious to know whether I will like it.

But my general feeling is that perhaps this should be timed with other disruptions, like switching to views---both will break code, this much more than views. That said, if you really would play with this well in advance of getting serious about merging it, then perhaps one doesn't need to wait.

timholy commented 10 years ago

I think the issue has served its purpose, and the next step is just to do it. So this can be closed.

timholy commented 10 years ago

@nalimilan, while I agree with you in that this proposal is logical and adds power---which is why I think it's worth exploring---the part I'm worried about is that it may be so much more common to want not to slice dimensions that I'll get annoyed at writing ranges all the time. Moreover, : has quite low precedence and gets used both for ranges and for conditionals (the x?a:b syntax), so this will inevitably add to the number of parentheses in code.

timholy commented 10 years ago

Actually, let me use this to ask one more thing: if I make the core changes to base/, anyone willing to take the lead in modifying the scripts in test/ to be consistent with the new behavior?

JeffBezanson commented 10 years ago

Should we go full APL and make the rank of the result the sum of the ranks of the indexes?

tshort commented 10 years ago

Instead of changing what's in base, can you add a new type? (Maybe that's what you plan.)

timholy commented 10 years ago

@JeffBezanson, can you point me to a link? I think I understand what you're asking, but not sure. I'd have to think about the implementation, but it sounds doable.

@tshort, I don't think it will be an effective experiment if we don't really try it. IF we do try it (and it's by no means too late to convince me otherwise), my plan would be to post a branch and people could play with it. I'm also mulling over whether there's a way to make the transition safely with an appropriate macro or two. It's not obvious there is, but I think it's worth careful consideration.

StefanKarpinski commented 10 years ago

I we do make this change, then it only makes sense to go "full APL", imo.

toivoh commented 10 years ago

Can someone provide a link on what APL indexing semantics actually are? Given @JeffBezanson's description, it seems more profound than just dropping singleton dimensions or not.

toivoh commented 10 years ago

Also, since there seem to be definite use cases for both dropping and keeping dimensions, how about that we provide both operations? This could even make the transition non-breaking.

Or does slice already do what is wanted? Could it? (I didn't even realize that we had it, so I'm not sure exactly what it does)

JeffBezanson commented 10 years ago

http://www.microapl.com/apl_help/ch_020_010_040.htm

lindahua commented 10 years ago

Agree that we may seriously consider adopting the APL rules, which are much more expressive.

timholy commented 10 years ago

Having looked at the rules briefly, yes, it sounds like a good idea to go all the way if we do this.

@toivoh, the new behavior would imitate slice:

julia> A = reshape(1:4, 2, 2)
2x2 Array{Int64,2}:
 1  3
 2  4

julia> A[1,:]
1x2 Array{Int64,2}:
 1  3

julia> sub(A, 1, :)
1x2 SubArray{Int64,2,Array{Int64,2},(Range1{Int64},Range1{Int64})}:
 1  3

julia> slice(A, 1, :)
2-element SubArray{Int64,1,Array{Int64,2},(Int64,Range1{Int64})}:
 1
 3

julia> slice(A, 1:1, :)
1x2 SubArray{Int64,2,Array{Int64,2},(Range1{Int64},Range1{Int64})}:
 1  3

What's your idea for making this non-breaking? I currently have what might be an effective, but somewhat complex, plan for migration. The key is to use deprecation on indexing operations whose behavior is changing, but provide users a mechanism for indicating that other indexing operations should follow the new behavior and therefore not be subject to the deprecation behavior:

getindex(A, i::Int, j::AbstractVector) would return a warning, but follow the old behavior (using @deprecate)
I or someone would have to create a new expression type, :newref, that gets converted into getindex_apl/setindex!_apl operations. This is the part I'm currently least sure about implementing.
Provide macros @ni and @newindex. The first is used on single expressions, @ni A[3,:] to create a :newref expression. The second wraps whole blocks of code and converts all :refs into :newrefs. The safest migration path would be to use @ni on an expression-by-expression basis until you're sure you've converted them all, then clear them all out by wrapping the entire thing in @newindex. And new code could start with @newindex right away, of course.

A bit ugly, but I'm not sure I see a good alternative.

timholy commented 10 years ago

I'm going to be occupied for the next several weeks with other tasks, and in any event this is 0.4 material. Let me make sure that folks noticed my request for collaboration on this issue, specifically on converting the files in /test to the new behavior: CC @isorber, @wenxgwen, @hayd, @BobPortmann, @nalimilan.

While I am curious to know how this indexing would work out in practice, much of my motivation for being willing to try it is because of the complaints about the current scheme. (I've implemented much of Julia's current array indexing, so I feel a certain amount of responsibility for handling complaints about how it works.) I'm reasonably happy with Julia's current behavior; if none of you are willing to share the pain of this transition, then I know I don't need to feel guilty about anything :smile: and just might find better uses for my time.

timholy commented 10 years ago

Looks like that should have been CC @lsorber.

timholy commented 10 years ago

Forgot @malmaud

lindahua commented 10 years ago

@timholy If we agree to change behavior (in whatever way), better to do it along with the array view framework. If we are going to deprecate sub-arrays in favor of array views (with the GenericView, it should 100% covers what a sub-array can do), then we can implement this behavior in array views, and delegate getindex to view.

malmaud commented 10 years ago

@timholy Full APL sounds good to me.

timholy commented 10 years ago

@lindahua, 100% agree that if we're going to make one breaking change in Arrays, it's the perfect time for a second. In fact I wonder about making 0.4 a quick release (like, 1 month) focused on breaking changes to the Array API. That way people don't have to have code hanging around for a long time with @newindex around everything (which will slow down parsing).

@malmaud, since you're one of several who has argued for this, what I was actually CCing you on was the request for a collaborator on implementation (see above). Was that a yes? :smile:

malmaud commented 10 years ago

@timholy Oh hah, sure. I'd be excited to help out with the APLification.

lindahua commented 10 years ago

I agree to focus on stabilizing the Array API in the 0.4 cycle.

Before we actually implement any thing or revise real codes, what about somebody put up together a specification of the array API (in a gist or somewhere)?

I would like to help, but don't have much time this week due to an upcoming deadline.

timholy commented 10 years ago

@malmaud, that would be great!

@lindahua, I don't have time for the next several weeks (mid/late March grant deadline). But after that I'll be ready to charge full steam ahead.

toivoh commented 10 years ago

It seems to me that full APL allows each of the indices to be an array of any dimension, but I still haven't seen a clarification on how it behaves when using more than one index of dimension greater than 1.

Also, the non-breaking alternative I was talking about was to keep getindex more or less as it is (maybe make it not drop trailing dimensions either?), and use slice when you want APL indexing. I'm not sure whether it's a good alternative, but I feel that it might deserve a little discussion.

StefanKarpinski commented 10 years ago

+1 for using slice for full APL indexing, plus either dropping trailing singletons like we currently do or absent dimensions.

timholy commented 10 years ago

I'm assuming it's supposed to act like reshape(A[ind1[:], ind2[:]], tuple(size(ind1)..., size(ind2)...)).

Your alternative deserves discussion. We'd want both getslice and setslice!, of course.

toivoh commented 10 years ago

Of course. And yes, that is what I would suppose as well.

timholy commented 10 years ago

I could be persuaded, though the right course is not obvious. I think the main argument the other way is that because the transition to returning views will be breaking, it's better to time any other breakages to go along with it. And it seems few know about or use slice now, even though it implements the kind of indexing under discussion. (It returns a SubArray, of course, and that may account for its lack of popularity.)

But if we're unsure whether we even want to go there, introducing getslice and setslice! would still be a step forward.

Other folks? It's an important decision.

nalimilan commented 10 years ago

I'm not sure what's the best solution between making getindex() drop dimensions and only making getslice() act this way. But I really don't like the idea of dropping trailing singleton dimensions and not others; this sounds really confusing and does not make much sense in my own practice.

Maybe we should make a list of use cases where dropping dimensions indexed by scalars is desirable or not, and see what's the most common case, and whether each behavior has cases in which it's really annoying?

Another solution would be to add a {} indexing operator which would call getslice(). That way it's easy to get the behavior you want.

StefanKarpinski commented 10 years ago

The { } syntax is already spoken for, so that's not viable, I'm afraid. Listing pro and con use cases seems like a good idea, although we can't really just count them up – there's still a huge amount of subjectivity about which ones are important.

I just realized that the "absent" dimension idea doesn't actually let you avoid dropping trailing singleton dimensions. For one thing, it's somewhat undesirable to have three things that are supposed to be column vectors: 1-d vectors, n x absent "vectors", and n x 1 matrices. For another thing, it wouldn't solve the v'*v problem – namely, you want this to be the dot product and produce a scalar, not an "absent x absent" tensor. I guess you could get that just be special-casing situations where the resulting tensor would have only absent dimensions as producing scalars.

StefanKarpinski commented 10 years ago

I also realized that since reductions like sum(A,d) and mean(A,d) already produce a tensor of the same rank as their input, A ./ sum(A,d) would continue to work correctly, regardless of slicing behavior, unless we also insisted on changing this behavior to produce a tensor of rank ndims(A)-ndims(d). Doing A .+ A[1,:] would, however, definitely break.

toivoh commented 10 years ago

I also think that listing use cases would be a good idea. I could imagine that e.g. dropping only trailing singleton dimensions doesn't come up very often; when you want to preserve your dimensions you want to preserve all of them.

I'm also not so sure that it would be very good to implement my absent dimensions idea :) especially in full generality. But yes, the thought was that an array with only absent dimensions could be unwrapped into a scalar, much like we sometimes unwrap results that would have been 0-dimensional arrays right now.

lindahua commented 10 years ago

I think about this more over the weekend. I am convinced by @timholy's argument that one can write a[i:i, :] to preserve dimensions when he wants. I am fine with dropping all dimensions corresponding to integer indices.

GunnarFarneback commented 10 years ago

i:i may be fine to preserve dimensions but that will quickly get annoying when you want to index with something even slightly complex and/or verbose. E.g. (frobozz + 2 gazonk):(frobozz + 2 gazonk) is no fun at all.

StefanKarpinski commented 10 years ago

I think the thing to do is use a[[i],:] as the idiom because there's no repetition of substantive content. I.e. you just write a[[frobozz + 2gazonk],:] and it's still just two more characters (and the amortized typing overhead is even lower).

toivoh commented 10 years ago

But the runtime overhead is higher since we will have to allocate a Vector to store the index in, right? Since ranges are immutable, they could avoid this overhead (and with enough inlining, even avoid the overhead due to duplication).

lindahua commented 10 years ago

@StefanKarpinski when using a[range, :] we know statically that it is going to be a strided view. However, for a[vector, :] we cannot make sure of this statically, ([1] and [1,4,8] are of the same type) and have to fallback to less efficient view types. Also, as @toivoh, [1] leads to unnecessary construction of an array. Therefore, a[[i], :] is not a very recommended way especially when efficiency is important.

lindahua commented 10 years ago

We may introduce an inline function to create i from i:i. So people won't have to write the index twice. I can't come up with a proper name that is also short though.

StefanKarpinski commented 10 years ago

You're totally right about the type inference problem there. I can get behind the a[[i],:] syntax, but a[i:i,:] syntax is just too nasty, not to mention the repetition of the substance. I guess a[v(i),:] or something like that would be possible, but this is really not feeling like a fun thing to have to write all the time. So far, the objections to dropping trailing singleton slices don't strike me as strong enough to warrant this kind of inconvenience. I'd like to see that list of problems that NumPy/APL style slicing solves versus the problems that it creates.

nalimilan commented 10 years ago

Wouldn't it be possible to special-case [x] when x is a scalar so that it can be inlined and eventually completely skipped when doing a[[x], :]?

StefanKarpinski commented 10 years ago

The [x] syntax is a bit special since it's at the intersection of the [a,b,c] and [a;b;c] syntaxes.

nalimilan commented 10 years ago

@StefanKarpinski But in the case where x is a single scalar, it's pretty simple, isn't it?

hayd commented 10 years ago

@timholy I'm definitely happy to help in the changing tests! (I would really like this change. :+1: )

Do we have the @ token to play with? Something like @i or @(expression) would be one char in most cases and may have a relatively obvious meaning... :S

StefanKarpinski commented 10 years ago

That syntax is not available – @i is the invocation of the i macro.

JuliaLang / julia

drop dimensions indexed with a scalar? #5949