JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.41k stars 5.45k forks source link

Dispatch to generic matrix product rather than gemm for reshaped transposed arrays #30988

Open ettersi opened 5 years ago

ettersi commented 5 years ago
julia> n = 1000;
       @time rand(n,n) * reshape(rand(n,n)', (n,n))
       @time rand(n,n) * Matrix( reshape(rand(n,n)', (n,n)) )
       ;
  0.834937 seconds (20 allocations: 22.889 MiB, 1.68% gc time)
  0.022891 seconds (16 allocations: 30.518 MiB)

Profiling indicates that the difference in runtimes comes about because the first call dispatches to _generic_matmul while the second dispatches to gemm!.

Related issues:

andreasnoack commented 5 years ago

It's generally best to avoid ReshapedArrays since they almost always end up dispatching to very slow methods. I wouldn't like to add ReshapedArrays methods for all functions that accept AbstractMatrix. Eventually, it might become possible to do something smarter with ReshapedArrays than fetching elements one-by-one but for now, the best thing you can do is to materialize a ReshapedArray.

ettersi commented 5 years ago

I think it would be justified to specialise * such that it materialises reshaped arrays before multiplying. It's a common enough operation, the performance penalty is huge, and figuring out that reshape(transpose(A), (n,n)) is the problem is rather tricky (both reshape(A, (n,n)) and transpose(A) on their own would be fine).

In case anyone else wonders why Julia has a lazy ReshapedArray type that only seems to make things slower: reshape is required to maintain the link to the underlying data, so this function cannot materialise.

andreasnoack commented 5 years ago

I think it would be justified to specialise * such that it materialises reshaped arrays before multiplying.

This is a problem all over the place and I'm not sure reshaping a transposed array necessarily is the common place where this causes problems. Consider e.g. things like

julia> @time sum(spzeros(10000,10000))
  0.000014 seconds (12 allocations: 78.672 KiB)
0.0

julia> @time sum(vec(spzeros(10000,10000)))
  0.230635 seconds (17 allocations: 78.797 KiB)
0.0

and similar things happen for most array types.

reshape is required to maintain the link to the underlying data, so this function cannot materialise.

That was indeed the conclusion in #4211. However, I think it's worth reconsidering for Julia 2, see https://github.com/JuliaLang/julia/issues/4211#issuecomment-409526010

ettersi commented 5 years ago

I'm not sure I understand. Are you suggesting not to materialise ReshapedArrays because for some array types (e.g. sparse arrays) materialising them to dense arrays could make things more costly?

andreasnoack commented 5 years ago

No. I'm saying that the problems with ReshapedArrays are all over the place so either we could either add materializing methods for ReshapedArrays for any function that has an AbstractArray method or we could ask users to materialize them explicitly. I think the latter makes more sense.