JuliaLang / LinearAlgebra.jl

Julia Linear Algebra standard library

Arraypocalypse Now and Then #255

Closed: mbauman closed this issue 7 years ago

mbauman commented 9 years ago

This issue supersedes the 0.4 work towards array nirvana (#7941), and will track the issues we aim to complete during 0.5 and beyond — now updated through work on 0.7. This is an umbrella issue and will track specific tasks in other issues. Please feel free to add things that I've missed.

Required underlying technologies

Major 0.5 breaking behavior changes

Major 0.6 breaking behavior changes

Possible future breaking changes

New functionality

Other speculative possibilities

tbreloff commented 9 years ago

This looks like a great list Matt, thanks. I'm a little scared of the fallout but it'll be a huge leap forward for the language.

On Tuesday, September 15, 2015, Matt Bauman notifications@github.com wrote:

This issue supersedes the 0.4 work towards array nirvana (#7941 https://github.com/JuliaLang/julia/issues/7941), and will track the issues we aim to complete during 0.5. This is an umbrella issue and will track specific tasks in other issues. Please feel free to add things that I've missed (at least during the first half of 0.5).

Required underlying technologies

Breaking behavior changes

New functionality

Other speculative possibilities

  • A mutable fixed-size buffer type, which would allow for a Julia-native Array definition (#12447 https://github.com/JuliaLang/julia/issues/12447)
  • Base IndexSet on BitArray or perhaps any AbstractArray{Bool}.
  • Rework nonscalar indexing to prevent calling find on logical arrays and simply wrap it with an IndexSet instead?
  • Negated indexing with complement IndexSet? (Perhaps in a package)

JeffBezanson commented 9 years ago

Yes, great list! Let's roll up our sleeves!

tbreloff commented 9 years ago

Are there any pieces of this that are unclaimed? I'd like to help, but I don't want to step on anyone's toes.

StefanKarpinski commented 9 years ago

It's a phenomenal list. There's actually only a few changes that are very breaking, which is nice, but those are significant enough.

mbauman commented 9 years ago

There is definitely more than enough work to go around! I don't think any of these tasks are claimed.

Things like dropping scalar dimensions are rather simple changes, but will take a lot of work finding and fixing bugs... and that part is easy to collaborate on. Same goes for views (if you ignore the perf issues with ReshapedArrays and inbounds). Anyone is welcome to dig in!

jakebolewski commented 9 years ago

Views is hard, you have to make it through bootstrap without :hocho: yourself in the :eyes:.

StefanKarpinski commented 9 years ago

Having just done a bunch of work to get string changes through bootstrap, I'm inclined to believe this.

milktrader commented 9 years ago

Thanks for doing this @mbauman, so much to digest

jiahao commented 9 years ago

I've added "Remove default no-op behavior for (c)transpose" as an item. I expect much complaining, but as we've discussed before, it's simply wrong to assume that <:Any is a scalar and the logic error rears its head every time one tries to wrap and/or implement custom array/matrix types. cc @jakebolewski @andreasnoack

StefanKarpinski commented 9 years ago

I think we need to think through the options for that carefully. It's pretty idiomatic to write A' to transpose a non-complex matrix.

mbauman commented 9 years ago

Isn't it possible that the (c)transpose wrapper will solve this issue? There's a lot of design work that will need to go into it, but:

transpose(A::AbstractVecOrMat) = TransposeType(A) # lazy wrapper; will call `transpose` on the elements upon indexing, too
transpose(::AbstractArray) = error("cannot take transpose of a 3+ dimensional array") # yeah, error methods aren't the best…
transpose(x) = x # everything else is treated as a scalar
carnaval commented 9 years ago

related: I think some of the typing problems in linalg (and the transpose behavior) come from the fact that we represent a blocked linear operator as an array of arrays. We may want to switch to a new type that knows about the block sizes inside it; I remember discussing that with @andreasnoack. 0.5 might be a time to at least think about it.

jiahao commented 9 years ago

Isn't it possible that the (c)transpose wrapper will solve this issue?

Maybe; we'd have to think about it.

The blocking issue last time is that transpose has to be recursive to handle cases like Matrix{Matrix} (example: A = [rand(1:5, 2, 2) for i=1:2, j=1:2]; A') correctly, but people want to write A' on 2D arrays of non-numeric types (e.g. images, as Matrix{<:Colorant}) and expect the transpose not to apply to the scalar elements. The no-op transpose(x::Any) method exists to handle these cases. However, this definition conflicts with matrix-like objects, which have the algebraic semantics of matrices but are not stored internally in any array-like form, and hence by JuliaLang/julia#987 should not be AbstractArrays (QRCompactWYQ is the poster child, but we have many such examples). If you introduce a new matrix-like type, you have to explicitly define (c)transpose, otherwise you get the no-op fallback, which is a source of many bugs.

To be clear, the behavior we would break explicitly is the claim (which you can find in the help for permutedims) that

Transpose is equivalent to permutedims(A, [2,1]).

This equivalence makes no sense for types that are not AbstractArrays, or for AbstractArrays of non-scalars, and we actually do have matrix-like types that need this more abstract sense of transpose.
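For concreteness, here is a small sketch in present-day Julia syntax of why the block case forces recursion (flat is just a throwaway helper, not an API): swapping only the outer blocks does not reproduce the transpose of the flattened matrix, while swapping the blocks and transposing each one does.

using LinearAlgebra

A = [rand(1:5, 2, 2) for i in 1:2, j in 1:2]      # a 2×2 matrix of 2×2 blocks

P = permutedims(A, (2, 1))                        # non-recursive: only the outer blocks move
R = [transpose(A[j, i]) for i in 1:2, j in 1:2]   # recursive: blocks move and each block is transposed

flat(M) = vcat([hcat(M[i, :]...) for i in 1:size(M, 1)]...)   # glue the blocks into one flat matrix
flat(R) == transpose(flat(A))                     # true
flat(P) == transpose(flat(A))                     # false in general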

tbreloff commented 9 years ago

I think assuming that a Matrix{Matrix} will automatically recursively transpose the elements is bad and dangerous. I'd rather see a special type BlockMatrix{Matrix} that does what you're looking for.

jiahao commented 9 years ago

@tbreloff that's exactly my point. Most people think it's odd that transpose should be recursive, but it must be in order to be a mathematically correct transpose, and that disquiet exposes corner cases where transposition is not simply permutedims(A, [2,1]). (Although it's true that Matrix{Matrix} is not really a blocked matrix type, because there are absolutely no guarantees that the inner matrices have dimensions that are consistent with any partitioning of a larger matrix.)

mbauman commented 9 years ago

Ah, yes, I forgot about all the Tensor-like objects that aren't AbstractArrays. No matter what happens, either the authors of Tensor-like objects will need to communicate to Julia somehow that they're not scalars (sometimes being an AbstractArray works, but not always), or the authors of Scalar-like objects will need to do the reverse (sometimes being a Number works, but not always), or both. This same sort of scalar-or-not question rears its head all over the place… e.g., indexing: https://github.com/JuliaLang/julia/pull/12567.

Right now we require a mishmash of method specializations for the tensors with some scalar-like fallbacks. This means that some get forgotten and we end up with scalar methods getting called, returning the wrong result.

Since we can't communicate this through supertypes (https://github.com/JuliaLang/julia/issues/987#issuecomment-31531602), I think it's gotta be either through better documentation of the required methods or through having those methods explicitly encoded in a traits-based system. And if we remove all fallbacks (which ensures correct behavior at the cost of missing methods for everyone), I think we need to make this as simple as possible with traits.
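To illustrate the traits idea, a hedged sketch in the Holy-traits style; ArrayLikeness, arraylikeness, and LazyOperator are made-up names, not anything in Base. A matrix-like type that is deliberately not an AbstractArray opts in with a single method instead of relying on catch-all fallbacks:

abstract type ArrayLikeness end
struct ScalarLike <: ArrayLikeness end
struct TensorLike <: ArrayLikeness end

arraylikeness(::Type) = ScalarLike()                    # explicit fallback: unknown types are scalars
arraylikeness(::Type{<:AbstractArray}) = TensorLike()

struct LazyOperator end                                 # stand-in for something like QRCompactWYQ
arraylikeness(::Type{LazyOperator}) = TensorLike()

# Generic indexing/linalg code dispatches on the trait rather than on <:AbstractArray:
is_scalar_for_linalg(x) = arraylikeness(typeof(x)) isa ScalarLike

is_scalar_for_linalg(1.0)            # true
is_scalar_for_linalg(rand(2, 2))     # false
is_scalar_for_linalg(LazyOperator()) # false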

johnmyleswhite commented 9 years ago

+1

I really think we should start enumerating traits for AbstractArrays. Linear indexing seems to have been an amazing example of the power of a few careful design decisions involving traits.
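For reference, the linear-indexing trait looks roughly like this under its current spelling (Base.IndexStyle / IndexLinear; in the 0.4/0.5 era it was Base.linearindexing / Base.LinearFast). SquaredRange is a made-up example type:

struct SquaredRange <: AbstractArray{Int,1}
    n::Int
end

Base.size(r::SquaredRange) = (r.n,)
Base.getindex(r::SquaredRange, i::Int) = i^2
Base.IndexStyle(::Type{SquaredRange}) = IndexLinear()   # the trait: one method, no new supertype

r = SquaredRange(5)
sum(r)       # 55; generic code picks the fast iteration path by dispatching on the trait
collect(r)   # [1, 4, 9, 16, 25]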

mbauman commented 9 years ago

One possible solution would be to keep AbstractArray <: Any, and introduce AbstractNonArrayTensor <: Any alongside it. Everything else would be considered "scalar" as far as indexing and linear algebra are concerned.

Note that this is distinct from and much more well-defined than an Atom vs. Collection distinction (https://github.com/JuliaLang/julia/pull/7244#issuecomment-46333261); A[:] = (1,2,3) and A[:] = "123" behave very differently from A[:] = 1:3 for A = Array(Any, 3), as they should.

ScottPJones commented 9 years ago

I really think we should start enumerating traits for AbstractArrays.

@johnmyleswhite By any chance did you mean language supported "traits"? That's one thing I've really wanted to see in the language since JuliaCon.

johnmyleswhite commented 9 years ago

Yes, I meant language supported traits.

ScottPJones commented 9 years ago

Do you have any ideas/suggestions about how traits could be added to Julia, syntactically? They would be very useful for arrays, strings, encodings, at the very least.

johnmyleswhite commented 9 years ago

I prefer leaving those decisions to Jeff rather than speculating.

ViralBShah commented 9 years ago

Given that this is an umbrella issue, it would be nice to discuss specific items in their own issues.

ScottPJones commented 9 years ago

I do think though that having traits in the language might substantially change the design for Arrays, which is why discussion of traits, at least in the area of how they could be used for better Array abstractions, would be useful.

mbauman commented 9 years ago

Please move traits discussion to JuliaLang/julia#5, and the (c)transpose issue to JuliaLang/julia#13171.

StefanKarpinski commented 9 years ago

I don't think the syntax for traits needs to be figured out before figuring out what traits are important for arrays. In fact, having more examples of traits that we actually need is excellent for helping design the language feature.

ScottPJones commented 9 years ago

Ok, good point, as long as traits are being thought of as part of the design.

ScottPJones commented 9 years ago

Upper and lower for triangular matrices? Those seem like good candidates for being done as traits to me.

StefanKarpinski commented 9 years ago

Added speculative item for replacing indexing into a type as syntax for typed arrays.

ViralBShah commented 9 years ago

I would prefer not to move sparse arrays out of base in 0.5. I think we should aim to have SparseVectors in Base for 0.5, and make sure that there is one really high quality default implementation.

As we move some of the other functionality out of base, we can move out the various sparse solvers first. If alternative sparse data structures are shaping up nicely at that point in JuliaSparse, we can eventually move SparseMatrixCSC there too.

ViralBShah commented 9 years ago

How about a small housecleaning task here as well - move all the array code into its own subdirectory and module in Base?

ViralBShah commented 9 years ago

There are a bunch of things marked with the 0.5 milestone. I would like to suggest that the issues referenced here be marked as 0.5 blockers, and that the rest marked 0.5 are essentially nice-to-haves. We could then have a 1-2 week period for accumulating a complete list of 0.5 blocker issues, and release 0.5 as soon as the blockers are all done. Any nice-to-have issues that get fixed in the same timeframe will make their way into 0.5, but the release won't block on any nice-to-have feature that is not ready.

andyferris commented 9 years ago

I'd like to add a few additional possibilities. I'm extremely excited about Arraypocalypse!

First, I think it would make sense to have operators for the tensor sum and tensor product. These could be ++ and **, or the Unicode symbols \otimes and \oplus. The tensor product can replace kron(). The tensor sum gives concatenation of vectors and, as an aside, it is a great description of string concatenation (there have been many, many conversations about this, and the core Julia developers seem worried that + corresponds to the wrong monoid here - and so go to * because it seems less commutative - while I would strongly argue the correct operation is the tensor sum \oplus, which is also not commutative). These can also be used for higher-order arrays/tensors (see below) and perhaps (I'm speculating here) for relational algebra operations on DataFrames.

Second, it has occurred to me that Julia may be really, really good at doing multi-linear algebra with extremely nice syntax. I have been playing with a package for assigning names to each index of an array as part of the type, using a Tuple{...} type. An array is turned into a tensor by indexing it like A[Tuple{:index1,:index2}], and if Julia 0.5 uses {} for Tuple{} then it's much improved to A[{:index1,:index2}] and we get the following awesomeness:

  1. Permutation: A[{1,2,3}] = B[{3,2,1}]
  2. A new way of doing matrix and/or vector multiplication, A[{:a}] = B[{:b,:a}] * C[{:b}] (contracts over :b index, so A = B' * C).
  3. Similarly, arbitrary and higher-order tensor contraction (a.k.a. einsum): e.g. A[{:a,:b}] = B[{:a,:c,:d}] * C[{:c,:e}] * D[{:d,:f}] * E[{:e,:f,:b}]. Brackets can be used to choose the fastest contraction ordering.
  4. Tensor sum and tensor product make sense here also - so people can take outer products of higher order arrays and put the indices in the desired order easily, for instance.

The permutation syntax may solve another issue - transpose is a linear algebra thing, but since it is so simple to type ' it is used instead of permutedims to, e.g. reflect an image (a 2D array of pixel values). These different use cases seem to be causing an issue as to whether transpose is called recursively, but I would argue this "tensor" syntax for permutedims is clear enough for use in general programming and data/array manipulation.

I'm not sure if people would like this approach to multi-linear algebra (it relies on generated functions quite a bit) but I would welcome feedback!
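For concreteness, the first two examples above map onto today's functions roughly as follows (the A[{...}] index-label notation itself is the speculative part, and general contractions like example 3 are what packages such as TensorOperations.jl already provide):

using LinearAlgebra

B = rand(2, 3, 4)
A = permutedims(B, (3, 2, 1))    # example 1: A[{1,2,3}] = B[{3,2,1}]; size(A) == (4, 3, 2)

Bm = rand(3, 5); C = rand(3)
a = transpose(Bm) * C            # example 2: contract over the shared :b index, i.e. a = Bm' * C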

ScottPJones commented 9 years ago

Could you elaborate a bit on the tensor sum and vector (and string) concatenation? I think it would be great to have a "generic" concatenation operator, which the mathematicians also like.

andyferris commented 9 years ago

The tensor sum (or direct sum) of v = [v_1,v_2,v_3] and w = [w_1, w_2, w_3] is written v \oplus w = [v_1,v_2,v_3,w_1,w_2,w_3]. (Imagine I'm using LaTeX - \oplus and \otimes already work in the Julia REPL and look like + (or times) drawn with a circle around it.)

The tensor/direct sum truly is the mathematical way of concatenating vectors (like vcat), and I would suggest strings also. For matrices, A \oplus B is the bigger, block-diagonal matrix [A 0; 0 B] (and is a good candidate to be represented by the BlockedArrays mentioned further above). For transposed (row) vectors, the tensor sum is again just the concatenation given by hcat, so this would have to be treated differently depending on whether you had a row vector or a 1-by-n matrix (which hopefully should be OK in Julia 0.5).
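A minimal sketch of what such operators could look like as plain user definitions (⊕ and ⊗ are not defined for arrays in Base; vcat, cat, and kron are just one possible backing implementation):

using LinearAlgebra   # for kron

⊕(v::AbstractVector, w::AbstractVector) = vcat(v, w)              # direct sum of vectors = concatenation
⊕(A::AbstractMatrix, B::AbstractMatrix) = cat(A, B; dims=(1, 2))  # block-diagonal [A 0; 0 B]
⊗(A::AbstractArray, B::AbstractArray) = kron(A, B)                # tensor/outer product

[1, 2] ⊕ [3, 4]           # [1, 2, 3, 4]
[1 2; 3 4] ⊕ [5 6; 7 8]   # 4×4 block-diagonal matrix
[1, 2] ⊗ [1, 10]          # [1, 10, 2, 20]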

So if we think of strings as vectors of characters, then direct/tensor sums are the way to go if the core developers are going to use words like "monoid" as a guiding syntax principle. Can string manipulation be thought of as a monoidal category where parallel arrows are concatenations? Certainly, multilinear algebra over real (or complex) numbers is described by a (dagger) symmetric bi-monoidal category where the tensor/direct sum (\oplus) and the tensor/outer/direct product (\otimes) are the two "monoids".

jiahao commented 9 years ago

@andyferris please open a new issue or PR for tensor concatenation. This issue is long and complicated enough without throwing in more features.

The relationship between multidimensional arrays and multilinear algebra was discussed in the APL literature of the 1970s, notably by Trenchard More and others working on APL2.

timholy commented 9 years ago

Yes, this deserves a new issue, this is too much of a derailment from the main work here.

(@andyferris, see also https://github.com/Jutho/TensorOperations.jl, https://github.com/mbauman/AxisArrays.jl, https://github.com/timholy/AxisAlgorithms.jl.)

cartazio commented 9 years ago

I'd like to share with y'all some code I wrote a while ago that I think hits one point in the design space for array operations that reflect locality and support sparse arrays as first class. Porting it over would require traits, I suspect, but you may find some of the ideas useful: https://github.com/wellposed/numerical/tree/master/src/Numerical/Array/Layout (the layout modules specifically).

andyferris commented 9 years ago

Thanks @timholy for those links; I wasn't aware of the last two.

ScottPJones commented 9 years ago

How about a small housecleaning task here as well - move all the array code into its own subdirectory and module in Base?

@ViralBShah That sounds like a very good idea - also move array unit tests so there is a nice 1-1 correspondence between implementation and test files.

ViralBShah commented 9 years ago

Should we get the ball rolling here by flipping the switch on the concatenation deprecation (#8599)?

StefanKarpinski commented 9 years ago

+1

andreasnoack commented 9 years ago

@mbauman has a branch where most of the transpose changes have been implemented. @jiahao will correct me if I'm wrong here, but the new behavior of transpose can be separated into a matrix transpose and a vector transpose, where the former is less controversial. So, to make some progress here and start the process of adjusting code, I'm proposing that we start with the matrix transpose type only and postpone the vector transpose a bit. @mbauman Do you think that would be feasible, and do you have the time to make this separation if we decide that it's a useful way to proceed?

mbauman commented 9 years ago

Implementing the transpose types themselves is rather simple — just a few dozen LOC. And the matrix and vector transpose types themselves aren't intrinsically coupled (except that making matrix transposes views without changing vectors would be even more strange than the status quo).

The most difficult part about the direction I headed in mb/transpose is that I also tried to remove the special lowering to Ax_mul_Bx. That part is a ton of work, with each operator containing hundreds of permutations of transpose types, special matrix types, and BLAS-compatible element types. I didn't quite make it to the end of the tunnel for mul, and I didn't even start tackling the other operators in that branch. Removing the special lowering is also the point where we need to make a decision about the vector transpose.

That said, there's no need for these two changes to be coupled to each other. The transpose type allows for the removal of the Ax_op_Bx special lowering steps, and even simplifies some of that mess, but removing Ax_op_Bx isn't necessary in order to introduce transpose views. I think the path forward here is to take smaller steps. While it was very informative to try tackling both together (it works! and it can make things simpler!), it's too much to bite off at one time.

I think it should be possible to add simple fallbacks, e.g., *(A::Transpose, B::AbstractArray) = At_mul_B(untranspose(A),B), in order to get transpose views working with the current system without touching hundreds of methods. My time is very limited these days, so I cannot promise I'll find time to do this myself. But I think this could be done in a relatively small PR. Anyone is welcome to pick up the torch here.
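For concreteness, a rough sketch of that shape; MatTranspose, lazy_transpose, and untranspose are placeholder names, and the forwarding method is spelled with plain transpose so the sketch runs on current Julia, where the 0.5-era version would forward to At_mul_B instead. The point is only the dispatch shape: one cheap wrapper plus one forwarding method per operation, rather than hundreds of specialized methods.

using LinearAlgebra

struct MatTranspose{T, S<:AbstractMatrix} <: AbstractMatrix{T}
    parent::S
end
MatTranspose(A::AbstractMatrix) = MatTranspose{eltype(A), typeof(A)}(A)

Base.size(A::MatTranspose) = reverse(size(A.parent))
Base.getindex(A::MatTranspose, i::Int, j::Int) = transpose(A.parent[j, i])   # applies transpose to the element, too

lazy_transpose(A::AbstractMatrix) = MatTranspose(A)
untranspose(A::MatTranspose) = A.parent

# The fallback described above: dispatch on the wrapper and forward to an existing kernel.
Base.:*(A::MatTranspose, x::AbstractVector) = transpose(untranspose(A)) * x

A = rand(3, 2); x = rand(3)
lazy_transpose(A) * x ≈ transpose(A) * x   # true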

StefanKarpinski commented 9 years ago

Couldn't we start by defining the Ax_mul_Bx methods in terms of the transpose types?

davidanthoff commented 8 years ago

How are things coming along with Arraypocalypse? Is the original plan to have a 0.5 release by the end of Jan with all of the things above in it still on the books?

timholy commented 8 years ago

Lots of other things are happening in 0.5, but AFAICT little is happening on this particular list. Some of us who have contributed a lot in the past to Julia's arrays are busy (in part, reaping the benefits of our previous work :wink:). To make this happen, it's possible others will have to step up to the plate (hint). But the just-posted JuliaLang/julia#14474 is perhaps the most exciting thing on this front to come in a long time---in many ways, it's THE crucial bottleneck.

If other stuff is ready to go, we could delay arrays to a later release.

ViralBShah commented 8 years ago

Yeah, lots of other exciting work is going on, but progress on this list is a little slow. My thought is that we review this when the debugger is in good shape, perhaps around the end of Jan. We have a new LLVM, thread safety, fast closures, Juno-Atom and many other goodies coming up. I do hope we can also get the array views in, if not this whole list.

davidanthoff commented 8 years ago

Here is one vote for delaying 0.5 until all the breaking changes around arrays are implemented. My understanding is that APL-style slicing (which breaks backwards compatibility) is already on master, so package authors will have to update a large amount of code for 0.5 no matter what. If the other major change (returning views instead of copies from []) only came in some post-0.5 Julia release, everyone would have to review/update all the same lines of code again, which would really be a LOT of extra work that could be entirely avoided if both of these breaking changes were delivered in one release.

Plus, from a user's perspective, it would be great if the major breaking changes in the language came earlier rather than later... From my point of view it would be much preferable to have things like a debugger, ARM compatibility, faster closures, threads etc. all come after the breaking language changes are done. All the code I write these days will have to be updated once these breaking changes land in master, whereas all the other good things mentioned would (most likely) not affect any of the code people are writing right now... The "looming" things like Arraypocalypse are the major reason a lot of people I know are staying away from Julia for now...

Of course I have no insight into whether there is anyone left who has the capacity to work on this; it seems some of the people who drove this earlier have moved on... But maybe Julia Computing can dedicate some resources to finishing this job?

In any case, could someone update the due date on the 0.5 milestone? At least if my interpretation of @ViralBShah's comment above is correct, there won't be a Julia 0.5 release on 1/31, which is the current information associated with the 0.5 milestone, right?

ViralBShah commented 8 years ago

This is not really an issue of dedicating resources. There are a fair number of open questions, and many folks still have apprehensions about array views. We'll have to do it, try it out, and see how it feels. As mentioned above, @blakejohnson's PR is the first step, so let's get that merged. There is certainly no plan to release 0.5 on 1/31. I am removing that date.