JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.65k stars 5.48k forks source link

Transpose and Adjoint multiplication #28772

Open Nosferican opened 6 years ago

Nosferican commented 6 years ago
julia> versioninfo()
Julia Version 1.0.0
Commit 5d4eaca0c9 (2018-08-08 20:58 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, haswell)

julia> a = transpose([-6.5])
1×1 LinearAlgebra.Transpose{Float64,Array{Float64,1}}:
 -6.5

julia> b = [0, .95]'
1×2 LinearAlgebra.Adjoint{Float64,Array{Float64,1}}:
 0.0  0.95

julia> Matrix(a) * Matrix(b) # Calling `Matrix`
1×2 Array{Float64,2}:
 0.0  -6.175

julia> a * b # Without calling `Matrix`
ERROR: MethodError: no method matching *(::Array{Float64,1}, ::Array{Float64,1})

Discourse 13742

Wanted to verify that not allowing multiplication between Transpose and Adjoint was intentional. If so, I think it would be helpful to have a more informative Exception.

andreasnoack commented 6 years ago

I'm pretty sure this is intentional and I think the error message is okay. Asking to multiply transposed and adjoint vectors is basically asking to multiply two vectors which few people find surprising given our definition of *. We allow that users convert transposed and adjoint vectors to matrices since it is very convenient and we allow to multiply matrices so it is hardly surprising that Matrix(a) * Matrix(b) works and I don't see that as an argument for allowing you to multiply two vectors. Unless you can come up with a compelling use case, I'd suggest that we close this issue.

Nosferican commented 6 years ago

I believe that a lot of effort went into making Transpose and Adjoint very efficient. If one cannot use them in the context of matrix multiplication when there might be other Transpose or Adjoint people would be forced to call Matrix for safety and be losing on significant efficiency, no? Calling size on the objects is also misleading as it masks how those are considered for multiplication.

size(a)
(1, 1)

size(b)
(1, 2)

Based on the size of the arguments these would be considered <:AbstractMatrix.

In my packages, I use extensively Transpose with matrix multiplication, but I would have to check if there is another Transpose involved in any multiplication I would stop using it and calling Matrix on it.

StefanKarpinski commented 6 years ago

@andreasnoack, I disagree and think that this should work based on our choice to identify transposed/adjoint vectors with row matrices. Yes, it's a bit questionable form a co/contra variance perspective, but we aren't fussy about that anywhere else and we shouldn't here either. The only weird part is that you would then have a*b working where b'*a' does not, but I don't think that's the end of the world and I think it's the natural consequence of the design we settled on.

StefanKarpinski commented 6 years ago

In any case, we need a policy here: the sizes indicate that these should be multiplicable. If we're not going to go by that then what is the definitive critierion for whether two arrays are multiplicable?

jebej commented 6 years ago

Related: on 0.7 this gives the "correct" dimension mismatch error, instead of a method error:

julia> a * b
ERROR: DimensionMismatch("Cannot multiply two vectors")
andreasnoack commented 6 years ago

Transpose/Adjoint is supposed to behave like 1D in linear algebra operations like * and \ and behave like a row matrix in general arrays operations. Could we at least see a real concrete example where this would be useful?

tkf commented 6 years ago

FYI mul!(..., a, b) "works" at the moment:

julia> a = transpose([-6.5])
1×1 Transpose{Float64,Array{Float64,1}}:
 -6.5

julia> b = [0, .95]'
1×2 Adjoint{Float64,Array{Float64,1}}:
 0.0  0.95

julia> mul!(zeros(1, 2), a, b)
1×2 Array{Float64,2}:
 0.0  -6.175

I prefer the type system to force me to do the math correctly and I agree what @andreasnoack said. (This is what I expected from the discussion in JuliaCon 2017 | Taking Vector Transposes Seriously | Jiahao Chen - YouTube but I haven't read through the whole GitHub issue so I may be missing some details in the decision.)

Nevertheless, I think it's reasonable to make * and mul! consistent (although I suppose we can't do more than deprecation warning until 2.0). Adding to @StefanKarpinski suggested about having some kind of policy, it would be nice to have policy for how * and mul! relate. Something like A * B throws iff for all Y, mul!(Y, A, B) throws. What do you think?

Nosferican commented 6 years ago

I think having size for AbstractMatrix{<:Number} be the criterion for valid multiplication is a good one. I also think there should be consistency between * and mul!.