Closed zygmuntszpak closed 4 years ago
Thanks, interesting.
I note you are doing matrix * vector'
. I'm curious what you want this for? When I first worked on the vector transpose stuff I didn't really want to even support that as valid linear algebra.
But Julia does tend to be forgiving with regards to the singleton dimensions, we should support this here becase LinearAlgebra
does. I guess we just forgot a method for this.
I think the only time this occurs is when you have one entity that happens to be a column vector (single column of matrix), and the other a row vector, and you are computing the outer product of these two vectors. This arose when I was implementing a particular cost function. I've provided a small example below. I think I've also figured out which additional dispatch rules to add in order to address this. However, I've stopped short of a pull-request because I'm baffled by the fact that when I am summing the problematic term in my cost function (for which my proposed fix now returns a StaticArray) I am encountering a lot of inexplicable allocations. I'm not sure if this is a different bug, or of it is due to the fact that I haven't done enough in my proposed fix.
The proposed fix involves adding these two lines to matrix_multiply.jl
@inline *(A::StaticMatrix, B::Adjoint{<:Any, <:StaticVector}) = *(reshape(A, Size(Size(A)[1],)), B)
@inline mul!(dest::StaticVecOrMat, A::StaticMatrix, B::Adjoint{<:Any, <:StaticVector}) = mul!(dest, reshape(A, Size(Size(A)[1],)), B)
Here is an example that appears to solve the problem, but doesn't quite work because of a lot of inexplicable allocations.
using StaticArrays, BenchmarkTools, LinearAlgebra
function hom(v::SVector)
push(v,1)
end
function T(๐::AbstractArray, ๐::Tuple{AbstractArray, Vararg{AbstractArray}}, ๐::Tuple{AbstractArray, Vararg{AbstractArray}})
โ = kron
l = length(๐)
๐โ = SMatrix{l,l}(1.0I)
๐โ = SMatrix{1,1}(1.0I)
๐ = @SMatrix zeros(l,l)
N = length(๐[1])
โณ, โณสน = ๐
ฮโ, ฮโ = ๐
๐ฒโ = @MMatrix zeros(4,4)
๐โ = @SMatrix [1.0; 0.0; 0.0]
๐โ = @SMatrix [0.0; 1.0; 0.0]
for n = 1: N
index = SVector(1,2)
๐ฒโ[1:2,1:2] .= ฮโ[n][index,index]
๐ฒโ[3:4,3:4] .= ฮโ[n][index,index]
๐ฆ = hom(โณ[n])
๐ฆสน= hom(โณสน[n])
๐โ = (๐ฆ โ ๐ฆสน)
โโ๐ฎโ = [(๐โ โ ๐ฆสน) (๐โ โ ๐ฆสน) (๐ฆ โ ๐โ) (๐ฆ โ ๐โ)]
๐โ = โโ๐ฎโ * ๐ฒโ * โโ๐ฎโ'
๐บโ = ๐' * ๐โ * ๐
๐บโโปยน = inv(๐บโ)
๐โ = @SMatrix zeros(Float64,l,l)
for k = 1:l
๐โ = ๐โ[:,k]
โ๐โ๐บโ = (๐โ โ ๐โ') * ๐โ * (๐โ โ ๐) + (๐โ โ ๐') * ๐โ * (๐โ โ ๐โ)
# Accumulating the result in ๐โ allocates memory, even though
# the two terms in the summation are both SArrays.
๐โ = ๐โ + ๐โ * ๐บโโปยน * (โ๐โ๐บโ) * ๐บโโปยน * ๐โ' * ๐ * ๐โ'
end
๐ = ๐ + ๐โ
end
๐
end
# Some sample data
N = 300
โณ = [@SVector rand(2) for i = 1:N]
โณสน = [@SVector rand(2) for i = 1:N]
ฮโ = [SMatrix{3,3}(Matrix(Diagonal([1.0,1.0,0.0]))) for i = 1:length(โณ)]
ฮโ = [SMatrix{3,3}(Matrix(Diagonal([1.0,1.0,0.0]))) for i = 1:length(โณ)]
F = @SMatrix rand(3,3)
๐ = (ฮโ,ฮโ)
๐ = (โณ, โณสน)
T(vec(F),๐,๐)
@btime T(vec($F),$๐,$๐) # 682.152 ฮผs (6002 allocations: 3.85 MiB
That's a beautiful example of code-as-mathematics (or is it mathematics-as-code?). The ASCII type names look out of place!
Your problem probably arises in the line
l = length(๐)
which very likely leads to uninferrable types for any StaticArray
which uses l
. If ๐
is statically sized that's fairly easily solved using the Length
trait.
Thank you!
I'm not entirely sure I understand what you mean by using the Length
trait. Are you suggesting that I added functions such as:
T(::Length{9}) = @SMatrix zeros(9,9)
T(::Length{3}) = @SMatrix zeros(3,3)
and then define
๐ = T(Length(๐))
instead of doing
l = length(๐)
๐ = @SMatrix zeros(l,l) ?
I don't think the issue is soley due to the use of l = length(๐)
and then using l
in StaticArray
. The reason why I don't think that is the sole issue is because I tried filling in literal values for all my StaticArrays
and I still ended up with allocations.
In particular,
using StaticArrays, BenchmarkTools, LinearAlgebra
function hom(v::SVector)
push(v,1)
end
function T(๐::AbstractArray, ๐::Tuple{AbstractArray, Vararg{AbstractArray}}, ๐::Tuple{AbstractArray, Vararg{AbstractArray}})
โ = kron
l = 9
๐โ = SMatrix{9,9}(1.0I)
๐โ = SMatrix{1,1}(1.0I)
๐ = @SMatrix zeros(9,9)
N = length(๐[1])
โณ, โณสน = ๐
ฮโ, ฮโ = ๐
๐ฒโ = @MMatrix zeros(4,4)
๐โ = @SMatrix [1.0; 0.0; 0.0]
๐โ = @SMatrix [0.0; 1.0; 0.0]
for n = 1: N
index = SVector(1,2)
๐ฒโ[1:2,1:2] .= ฮโ[n][index,index]
๐ฒโ[3:4,3:4] .= ฮโ[n][index,index]
๐ฆ = hom(โณ[n])
๐ฆสน= hom(โณสน[n])
๐โ = (๐ฆ โ ๐ฆสน)
โโ๐ฎโ = [(๐โ โ ๐ฆสน) (๐โ โ ๐ฆสน) (๐ฆ โ ๐โ) (๐ฆ โ ๐โ)]
๐โ = โโ๐ฎโ * ๐ฒโ * โโ๐ฎโ'
๐บโ = ๐' * ๐โ * ๐
๐บโโปยน = inv(๐บโ)
๐โ = @SMatrix zeros(Float64,9,9)
for k = 1:l
๐โ = ๐โ[:,k]
โ๐โ๐บโ = (๐โ โ ๐โ') * ๐โ * (๐โ โ ๐) + (๐โ โ ๐') * ๐โ * (๐โ โ ๐โ)
# Accumulating the result in ๐โ allocates memory, even though
# the two terms in the summation are both SArrays.
๐โ = ๐โ + ๐โ * ๐บโโปยน * (โ๐โ๐บโ) * ๐บโโปยน * ๐โ' * ๐ * ๐โ'
end
๐ = ๐ + ๐โ
end
๐
end
# Some sample data
N = 300
โณ = [@SVector rand(2) for i = 1:N]
โณสน = [@SVector rand(2) for i = 1:N]
ฮโ = [SMatrix{3,3}(Matrix(Diagonal([1.0,1.0,0.0]))) for i = 1:length(โณ)]
ฮโ = [SMatrix{3,3}(Matrix(Diagonal([1.0,1.0,0.0]))) for i = 1:length(โณ)]
F = @SMatrix rand(3,3)
๐ = (ฮโ,ฮโ)
๐ = (โณ, โณสน)
T(vec(F),๐,๐)
@btime T(vec($F),$๐,$๐) # 682.152 ฮผs (6002 allocations: 3.85 MiB
It is still allocating even though I don't use l
in a StaticArray
.
I tried @code_warntype T(vec(F),๐,๐)
which produced an output too long to list here. However, a particular line jumped out with Any
:
1230 โ %5524 = invoke Base.afoldl(%5506::typeof(*), %5517::SArray{Tuple{9,1},Float64,2,9}, %5475::Adjoint{Float64,SArray{Tuple{9},Float64,1,9}}, _2::SArray{Tuple{9},Float64,1,9}, %5476::Adjoint{Float64,SArray{Tuple{9},Float64,1,9}})::Any
I don't know how to parse that. Perhaps it is indeed related to the fact that I have omitted something from my proposed "fix":
@inline *(A::StaticMatrix, B::Adjoint{<:Any, <:StaticVector}) = *(reshape(A, Size(Size(A)[1],)), B)
@inline mul!(dest::StaticVecOrMat, A::StaticMatrix, B::Adjoint{<:Any, <:StaticVector}) = mul!(dest, reshape(A, Size(Size(A)[1],)), B)
not really a solution, but in case this particular issue is an obstacle for some project you are working on:
You can work around this issue (in case you know you have a column vector as a matrix) by using vec
.
A = @SMatrix rand(9,1)
B = SArray{Tuple{9},Float64,1,9}(1,2,3,4,5,6,7,8,9)
C = adjoint(B)
D = vec(A)*C
typeof(D) # -> SArray
this also works: (it restricts *
to matrices where the second dimension is of length 1)
@inline (Base.:*)(A::StaticArray{Tuple{N,1},<:Any,2}, B::Adjoint{<:Any,<:StaticVector}) where {N} = vec(A) * B
to your example. the culprint is Base.afoldl(::typeof(*), ...)
according to @code_warntype
. so a workaround (additional to the *
def above) is to use parenthesis
...
๐โ = ๐โ + (((๐โ * ๐บโโปยน) * (โ๐โ๐บโ)) * ๐บโโปยน) * ๐โ' * ๐ * ๐โ'
...
julia> @btime T(vec($F),$๐,$๐) # 682.152 ฮผs (6002 allocations: 3.85 MiB
355.293 ฮผs (0 allocations: 0 bytes)
9ร9 SArray{Tuple{9,9},Float64,2,81}:
12.4606 12.7101 9.54674 6.48859 7.60029 3.98854 8.99992 11.6833 0.0
11.0465 14.5946 9.6956 6.44432 9.02317 4.72583 9.0564 11.8339 0.0
21.9599 25.7667 18.1746 12.062 15.7895 7.82211 18.386 23.9433 0.0
10.1492 10.1904 9.10567 7.61871 9.55715 3.85434 7.88378 11.0412 0.0
9.03448 12.208 9.16016 7.37143 10.5531 4.5322 7.87816 10.9953 0.0
17.1791 20.2238 16.6752 13.7404 18.7066 7.20513 15.7584 21.9906 0.0
22.8134 22.3908 20.2338 13.6448 16.1804 8.51447 17.4098 23.062 0.0
19.7638 26.0583 20.4084 13.0173 18.3174 9.9884 17.2918 22.8682 0.0
38.6198 44.3518 37.7591 24.2729 31.979 16.2962 34.9386 46.1641 0.0
edit: i somehow completely missed that you diagnosed the afoldl
line already.
Thank you very much for the wonderful suggestions. Do you think that I have stumbled upon some limitation of the inference engine in Base
, or do you think this problem is still at the level of the StaticArrays
package? I will try to simplify the code further whilst still keeping this "bug".
I suspect the issue is the lack of proper inlining the n-ary operator *
when Adjoints of static arrays are involved. While it does work for simpler code examples, it doesn't work in your case. Maybe thats because the culprint is a Base method that has no @inline
annotation but still gets inlined in simple cases.
For example this hacky fix makes your original code work:
const StaticArrayLike = Union{StaticArray,Adjoint{<:Any,<:StaticArray}}
@inline (Base.:*)(A::StaticArrayLike, B, C, D...) = *(*(A,B), C, D...)
@inline (Base.:*)(A::StaticArrayLike, B::StaticArrayLike, C, D...) = *(*(A,B), C, D...)
@inline (Base.:*)(A::StaticArray{Tuple{N,1},<:Any,2}, B::Adjoint{<:Any,<:StaticVector}) where {N} = vec(A) * B
funny enough it only works if both the first and the second method are added to *
.
This is already fixed in 0.12.4 and #814 fixes more similar cases.
A product of two StaticArrays results in an allocated array. Here is a minimal working example: