Closed bsxfan closed 8 years ago
Ideally, we should at least support BLAS and LAPACK operations when either stride(M,1)
or stride(M,2)
is equal to one.
The latter case corresponds to row-major order, and will occur for matrices returned from external C code (e.g. NumPy arrays returned from Python). There is no reason it can't be handled efficiently in LAPACK as it is simply the transpose of the matrix, and one can always convert an operation on a matrix to an equivalent operation on its transpose.
Note that we should also make StridedMatrix
an abstract type, not a Union, so that other storage-compatible types can be defined for it (issue #2345).
I believe Viral mentioned recently that he, Jeff and I were tossing around the idea of just representing all Julia arrays with a data pointer + strides and then making all linear indexing operations (i.e. with ranges) return subarrays, which will actually simply be first-class array objects since they will use the same data pointer with different stride info. It might be a bit rocky during the transition, but collapsing the distinction between arrays and subarrays would ensure that everything worked smoothly for both (since they're the same thing).
I also really like the idea of doing more things lazily too. Jeff is convinced that no one would notice in most code if slices were views, which I suspect is true because most arrays go through a construction phase during which they are used as mutable containers and then transition into a "value phase" in which they are used as values much like numbers and are thus treated as immutable. Once you're taking slices, an array is probably being treated as immutable already.
I've been wanting a transition to data pointer + strides for some time, exciting to hear that there is interest in this.
Indeed. I have been blithely assuming that a subarray was a pointer into the original array and just discovered that is was a copy.
@stevengj wrote:
Ideally, we should at least support BLAS and LAPACK operations when either stride(M,1) or stride(M,2) is equal to one. The latter case corresponds to row-major order, and will occur for matrices returned from external C code (e.g. NumPy arrays returned from Python). There is no reason it can't be handled efficiently in LAPACK as it is simply the transpose of the matrix, and one can always convert an operation on a matrix to an equivalent operation on its transpose.
Is this always the case? Consider for example getrs!()
, which solves lufact(X)\B == X\B
, where (A,ipiv) = LU(X)
:
function getrs!(trans::BlasChar, A::StridedMatrix{$elty}, ipiv::Vector{BlasInt}, B::StridedVecOrMat{$elty})
Here trans
can effectively transpose X
, but is there any way of transposing B
?
I think there should be distinguished between the LAPACK interface and the factorizations here. It was a choice not to have abstract types or unions in the fields of the factorizations. For now, my imaginations only allows for changing the definitions in factorization.jl from StridedMatrix
to Matrix
.
@stevengj Not all LAPACK subroutines have a transpose argument. Isn't that necessary for efficient treatment of row-major arrays?
@andreasnoackjensen No, an explicit transpose argument is not required, because given most linear-algebra operations on A you can usually cheaply infer the corresponding operations on A'. (Obviously, this is trivially the case for the important case of symmetric A.)
For example, given the A=LDU factorization, you immediately have A'=U'DL', which is an LDU factorization of A' (similarly with A=LU modulo a little rescaling). The eigenvectors of A' are the left eigenvectors of A. Given the SVD A=USV' you immediately have the SVD A'=VS'U'. And so on.
Lots of people (including me) use LAPACK with row-major arrays using these tricks. (I think the LAPACKE C interface may implement many of these tricks, for example.)
@dmbates wrote:
Indeed. I have been blithely assuming that a subarray was a pointer into the original array and just discovered that is was a copy.
julia> P = ones(Int,4,4)
4x4 Int32 Array:
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
julia> S = sub(P,2:3,2:3)
2x2 SubArray of 4x4 Int32 Array:
1 1
1 1
julia> S[:]=0
0
julia> P
4x4 Int32 Array:
1 1 1 1
1 0 0 1
1 0 0 1
1 1 1 1
@bsxfan Thank you. What I was missing was the sub
function. I was just using indexing.
There is probably remaining issues with subarrays, but we should open specific issues about them as they are encountered. The specific issue qith QR has already been fixed.
factorization.jl
is currently undergoing some maintenance (see #3089), but here is another problem that has not been mentioned there. For convenience when reading this, recall that:I find both the documentation (Julia Manual) and code (factorzation.jl) is misleading w.r.t. StridedMatrix. Several in-place method signatures that suggest they handle StridedMatrix arguments, do not really do so (they crash). Mostly one does not notice this, because the in-place methods are invoked via a copy(), which converts SubArray to Matrix.
The Julia Manual claims:
"StridedVector and StridedMatrix are convenient aliases defined to make it possible for Julia to call a wider range of BLAS and LAPACK functions by passing them either Array or SubArray objects, and thus saving inefficiencies from indexing and memory allocation."
This is true only for matrices
M
such thatstride(M,1) ==1
. If a StridedMatrix withstride(M,1)>1
should reach Julia's LAPACK wrapper methods, they crash (but mostly this does not happen due to copying). In the BLAS case, sending a StridedMatrix withstride(M,1)>1
to gemm_wrapper will invokegeneric_matmatmul()
rather thanBLAS.gemm!
.Let's experiment with the example similar to the one in the manual (http://docs.julialang.org/en/latest/manual/arrays.html?highlight=subarray), which sends a SubArray to qr(). (Also see #3114 for a related problem).
So
G
is a 2x2, BLAS/LAPACK-friendlySubArray
. Notice thatSubArray
does provide a generalization ofMatrix
, because for a standard 2x2Matrix
, we would get:strides(randn(2,2))==(1,2)
.Now we also create a 2x2
SubArray
which cannot be handled by BLAS/LAPACK:What happens if we send
G
orB
to a matrix factorization? Well, it just copies the argument, thereby converting it to Matrix before doing the factorization, for example:This sort of defeats the whole prupose of having the StridedMatrix here.
Now let's try harder to send a SubArray to LAPACK. The signatures
suggest we can do so, but:
What has happened here is that
LAPACK.geqrt3!(G)
was in fact succesful, but has returned a SubArray, which cannot be accepted by the constructorQR(LAPACK.geqrt3!(A)...)
, because the QR type cannot in fact contain SubArrayss---it contains standard Matrices:The behaviour of other factorizations is similar. I'm sure a bit of refactorization can generalize the factorization types to contain StridedMatrix rather than Matrix. But it still seems a bit funny that StridedMatrix serves its purpose only in the case stride(M,1)==1. I'm tempted to suggest that SubArrays with stride(M,1)>1 should have a different type.