Add custom `adjoint` method for `Tensor` type

jofrevalles commented 1 year ago

Summary

In this PR we fix the bug found in issue #40:

Added custom Base.adjoint functions for Tensor types. Only for two-dimensional Tensors, the labels are reversed.
Added tests for Base.adjoint.

Example

julia> using Tenet
julia> b = Tensor(rand(Complex{Float64}, 2, 2), (:i, :j))
2×2 Tensor{ComplexF64, 2, Matrix{ComplexF64}}:
 0.560191+0.926476im  0.381094+0.917558im
 0.703793+0.5878im    0.741097+0.658213im
julia> b'
2×2 Tensor{ComplexF64, 2, LinearAlgebra.Adjoint{ComplexF64, Matrix{ComplexF64}}}:
 0.560191-0.926476im  0.703793-0.5878im
 0.381094-0.917558im  0.741097-0.658213im
julia> b'.labels
(:j, :i)

codecov[bot] commented 1 year ago

Codecov Report

Merging #41 (8e2b4f0) into master (0bbd73d) will decrease coverage by 0.01%. The diff coverage is 0.00%.

@@            Coverage Diff            @@
##           master     #41      +/-   ##
=========================================
- Coverage    0.38%   0.38%   -0.01%     
=========================================
  Files          13      13              
  Lines         772     773       +1     
=========================================
  Hits            3       3              
- Misses        769     770       +1

Impacted Files	Coverage Δ
src/Tensor.jl	`0.00% <0.00%> (ø)`

... and 1 file with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

jofrevalles commented 1 year ago

Could you add a test case for a complex-vector and also nest some test set to differentiate them.

Okay! The test set to differentiate should be test/Integration/ChainRules_test.jl, right?

mofeing commented 1 year ago

Yes, just create a @testset inside the other @testset.

jofrevalles commented 1 year ago

Something like this?

        @testset "[adjoint]" begin
            a = Tensor(rand(2, 2), (:i, :j))
            b = adjoint(a)

            test_frule(contract, a, b)
            test_rrule(contract, a, b)
        end

Or should I also add the ChainRules.rrule and frules for the adjoint function?

mofeing commented 1 year ago

You only need to add rrules and frules for adjoint if what you pass to an AD backend is a function that calls adjoint. But here you only call contract.

This should be fine.

jofrevalles commented 1 year ago

Okay, now it should be working properly.

mofeing commented 1 year ago

@jofrevalles tests failed.

jofrevalles commented 1 year ago

Yes, I think that is because we need the fixes from #39

mofeing commented 1 year ago

Impossible because it was merged.

jofrevalles commented 1 year ago

But this branch does not have that commit, it was created before the merge.

mofeing commented 1 year ago

But this branch does not have that commit, it was created before the merge.

Tests run after "merging" the branches temporarily, so the repo where the tests run contain both commits from master and this PR. So tests run on this PR contain the commits from #39.

jofrevalles commented 1 year ago

But this branch does not have that commit, it was created before the merge.

Tests run after "merging" the branches temporarily, so the repo where the tests run contain both commits from master and this PR. So tests run on this PR contain the commits from #39.

Okay sorry, I did not know that. So it seems that adjoint(Matrix) is not a Matrix type, and this was causing problems. I changed the function to the conjugate conj (where conj(Matrix) isa Matrix) and now should be working fine.

mofeing commented 1 year ago

Okay sorry, I did not know that. So it seems that adjoint(Matrix) is not a Matrix type, and this was causing problems. I changed the function to the conjugate conj (where conj(Matrix) isa Matrix) and now should be working fine.

That's not entirely correct. It is not a Matrix but it is an AbstractMatrix.

julia> a = rand(ComplexF32,2,2)
2×2 Matrix{ComplexF32}:
 0.690143+0.435968im  0.325493+0.824115im
 0.734422+0.551891im  0.793292+0.556192im

julia> adjoint(a)
2×2 adjoint(::Matrix{ComplexF32}) with eltype ComplexF32:
 0.690143-0.435968im  0.734422-0.551891im
 0.325493-0.824115im  0.793292-0.556192im

julia> adjoint(a) isa Matrix
false

julia> adjoint(a) isa AbstractMatrix
true

I'm not sure just using conj is the solution. The nice thing about the LinearAlgebra.Adjoint type is that it performs the transpose and the conjugate lazily, so it doesn't duplicate memory when we perform a' or b' on ChainRules.rrule.

EDIT: notice that another test fails now.

jofrevalles commented 1 year ago

That's not entirely correct. It is not a Matrix but it is an AbstractMatrix.

The problem is that since it is not a Matrix type, somewhere it tries to convert a Tensor with a Matrix in data to a Tensor with LinearAlgebra.Adjoint{T, Matrix{T} in data:

julia> a = Tensor(rand(2, 2), (:i, :j))
2×2 Tensor{Float64, 2, Matrix{Float64}}:
 0.813942  0.640558
 0.148885  0.77122
julia> b = adjoint(a)
2×2 Tensor{Float64, 2, LinearAlgebra.Adjoint{Float64, Matrix{Float64}}}:
 0.813942  0.148885
 0.640558  0.77122
julia> test_rrule(contract, a, b)
test_rrule: contract on Tensor{Float64, 2, Matrix{Float64}},Tensor{Float64, 2, LinearAlgebra.Adjoint{Float64, Matrix{Float64}}}: Error During Test at /home/jofrevalles/.julia/packages/ChainRulesTestUtils/lERVj/src/testers.jl:202
  Got exception outside of a @test
  MethodError: Cannot `convert` an object of type 
    Tensor{Float64{},2,Matrix{Float64}} to an object of type 
    Tensor{Float64{},2,LinearAlgebra.Adjoint{Float64, Matrix{Float64}}}
  Closest candidates are:
    convert(::Type{T}, ::LinearAlgebra.Factorization) where T<:AbstractArray at /opt/julia-1.8.2/share/julia/stdlib/v1.8/LinearAlgebra/src/factorization.jl:58
    convert(::Type{T}, ::T) where T<:AbstractArray at abstractarray.jl:16
    convert(::Type{T}, ::T) where T at Base.jl:61
    ...

EDIT: notice that another test fails now.

This is because I should change the test of adjoint from adjoint(data) to conj(data). With this, it doesn't fail.

Right now we are not using adjoint in ChainRules, we are using conjugate since it is general. However, if you prefer to keep the adjoint for two-dimensional tensors, we can maybe extend the convert function so it knows how to handle LinearAlgebra.Adjoint{Matrix} types.

mofeing commented 1 year ago

Okay, let's discuss it in person.

mofeing commented 1 year ago

@jofrevalles I've found one of this edge cases: matrix multiplication.

Old definition with `adjoint`

If we define adjoint(::Tensor) as:

Base.adjoint(t::Tensor{T,2,A}) where {T,A<:AbstractMatrix{T}} = Tensor(adjoint(parent(t.data)), reverse(labels(t)); t.meta...)

then

julia> using Tenet, LinearAlgebra

julia> Y = Tensor([0 -im; im 0], (:i, :j))
2×2 Tensor{Complex{Int64}, 2, Matrix{Complex{Int64}}}:
 0+0im  0-1im
 0+1im  0+0im

julia> Y'
2×2 Tensor{Complex{Int64}, 2, LinearAlgebra.Adjoint{Complex{Int64}, Matrix{Complex{Int64}}}}:
 0+0im  0-1im
 0+1im  0+0im

julia> Y * Y'
2×2 Matrix{Complex{Int64}}:
 1+0im  0+0im
 0+0im  1+0im

julia> tr(Y * Y')
2 + 0im

julia> contract(Y,Y')
0-dimensional Tensor{Complex{Int64}, 0, Array{Complex{Int64}, 0}}:
2 + 0im

New definition with `conj`

But if we define adjoint(::Tensor) as...

Base.adjoint(t::Tensor{T,2,A}) where {T,A<:AbstractMatrix{T}} = Tensor(conj(parent(t.data)), labels(t); t.meta...)

then matrix multiplication * and contract mismatch

julia> Y
2×2 Tensor{Complex{Int64}, 2, Matrix{Complex{Int64}}}:
 0+0im  0-1im
 0+1im  0+0im

julia> Y'
2×2 Tensor{Complex{Int64}, 2, Matrix{Complex{Int64}}}:
 0+0im  0+1im
 0-1im  0+0im

julia> Y * Y'
2×2 Matrix{Complex{Int64}}:
 -1+0im   0+0im
  0+0im  -1+0im

julia> tr(Y * Y')
-2 + 0im

julia> contract(Y, Y')
0-dimensional Tensor{Complex{Int64}, 0, Array{Complex{Int64}, 0}}:
2 + 0im

julia> transpose(Y')
2×2 transpose(::Tensor{Complex{Int64}, 2, Matrix{Complex{Int64}}}) with eltype Complex{Int64}:
 0+0im  0-1im
 0+1im  0+0im

julia> tr(Y * transpose(Y'))
2 + 0im

Summary

contract is always correct while * might not. I think I prefer the adjoint call to ensure compatibility with other Julia code right now.

There is an easy solution: instead of just calling adjoint, we can call copy ∘ adjoint. This way the Adjoint structure disappears and gets converted into a Matrix again.

julia> adjoint(parent(Y)) |> typeof
Adjoint{Complex{Int64}, Matrix{Complex{Int64}}}

julia> adjoint(parent(Y)) |> copy |> typeof
Matrix{Complex{Int64}} (alias for Array{Complex{Int64}, 2})

We wouldn't benefit from lazy views of the original array (i.e. Y is not copied and then transformed when calling Y'), but this is something that we also lose when calling conj. I'm uploading this change to check it out against the tests.

mofeing commented 1 year ago

Btw, the contents of the metadata of the tensors were being tested... but they were initialized to an empty Dict so it was true always. 😆

If it is initialized to something, it fails. I'm submitting a fix also.

bsc-quantic / Tenet.jl