FluxML / Zygote.jl

21st century AD
https://fluxml.ai/Zygote.jl/
Other
1.48k stars 213 forks source link

Wrong adjoint for Cholesky #932

Closed theogf closed 3 years ago

theogf commented 3 years ago

I think the adjoint for the Cholesky ldiv is wrong : https://github.com/FluxML/Zygote.jl/blob/56f411876d545ff758d78053d89c45b57d88e41c/src/lib/array.jl#L473

The NamedTuple uses the wrong names, I think it should be info instead of status : https://github.com/JuliaLang/julia/blob/91c297b6c983aed2828600bcd9b8ba6494f747b6/stdlib/LinearAlgebra/src/cholesky.jl#L85 :

struct Cholesky{T,S<:AbstractMatrix} <: Factorization{T}
    factors::S
    uplo::Char
    info::BlasInt

    function Cholesky{T,S}(factors, uplo, info) where {T,S<:AbstractMatrix}
        require_one_based_indexing(factors)
        new(factors, uplo, info)
    end
end
DhairyaLGandhi commented 3 years ago

Cc @willtebbutt

theogf commented 3 years ago

I can confirm that replacing status by info solve (my) eventual problems

DhairyaLGandhi commented 3 years ago

Could you post an MWE for later? Also would you be up for a pr to this change?

theogf commented 3 years ago

I unfortunately cannot produce a MWE (although I tried), I think accum needs to be involved? And yeah happy to make a PR!

willtebbutt commented 3 years ago

Hmm yeah, this probably has something to do with me. @theogf if not a MWE, could you provide a stack trace? I might be abl to reverse-engineer a MWE from that.

theogf commented 3 years ago

It was very long so I shortened it where I think you have enough info. For reference the guilty one is AugmentedGaussianProcesses.jl on the Z_opt_fixes branch when trying to do AD on the ELBO (see autotuning.jl)

Error During Test at /home/theo/.julia/dev/AugmentedGaussianProcesses/test/likelihood/gaussian.jl:21
  Got exception outside of a @test
  ArgumentError: NamedTuple{(:uplo, :status, :factors), Tuple{Nothing, Nothing, Matrix{Float64}}} keys must be a subset of NamedTuple{(:uplo, :info, :factors), Tuple{Nothing, Nothing, Diagonal{Float64, Vector{Float64}}}} keys
  Stacktrace:
    [1] #s65#84
      @ ~/.julia/packages/Zygote/9X8lu/src/lib/lib.jl:20 [inlined]
    [2] var"#s65#84"(::Any, x::Any, y::Any)
      @ Zygote ./none:0
    [3] (::Core.GeneratedFunctionStub)(::Any, ::Vararg{Any, N} where N)
      @ Core ./boot.jl:571
    [4] (::Zygote.var"#back#187"{:Σ, Zygote.Context, AugmentedGaussianProcesses.Posterior{Float64}, Cholesky{Float64, Matrix{Float64}}})(Δ::NamedTuple{(:uplo, :status, :factors), Tuple{Nothing, Nothing, Matrix{Float64}}})
      @ Zygote ~/.julia/packages/Zygote/9X8lu/src/lib/lib.jl:224
    [5] (::Zygote.var"#1697#back#188"{Zygote.var"#back#187"{:Σ, Zygote.Context, AugmentedGaussianProcesses.Posterior{Float64}, Cholesky{Float64, Matrix{Float64}}}})(Δ::NamedTuple{(:uplo, :status, :factors), Tuple{Nothing, Nothing, Matrix{Float64}}})
      @ Zygote ~/.julia/packages/ZygoteRules/OjfTt/src/adjoint.jl:59
    [6] Pullback
      @ ./Base.jl:33 [inlined]
    [7] (::typeof(∂(getproperty)))(Δ::NamedTuple{(:uplo, :status, :factors), Tuple{Nothing, Nothing, Matrix{Float64}}})
      @ Zygote ~/.julia/packages/Zygote/9X8lu/src/compiler/interface2.jl:0
    [8] Pullback
      @ ~/.julia/packages/ZygoteRules/OjfTt/src/ZygoteRules.jl:11 [inlined]
    [9] Pullback
      @ ~/.julia/dev/AugmentedGaussianProcesses/src/gpblocks/posterior.jl:5 [inlined]
   [10] Pullback
      @ ~/.julia/dev/AugmentedGaussianProcesses/src/gpblocks/latentgp.jl:242 [inlined]
willtebbutt commented 3 years ago

Oh, amazing, this is an example of @mzgubic 's recent fix in action -- previously this was silently dropping either the info or status field. In this particular example it probably didn't matter because they'll be nothings anyway, but it's good to resolve this.

Here's a MWE:

using LinearAlgebra
using Zygote

function reproduce_theo_error(C::Cholesky, x::Vector{<:Real})
    return sum(C \ x) + logdet(C)
end

A = randn(5, 5);
C = cholesky(Symmetric(A'A + I))
x = randn(5)
Zygote.gradient(reproduce_theo_error, C, x)

which produces

julia> Zygote.gradient(reproduce_theo_error, C, x)
ERROR: ArgumentError: NamedTuple{(:uplo, :status, :factors), Tuple{Nothing, Nothing, Matrix{Float64}}} keys must be a subset of NamedTuple{(:uplo, :info, :factors), Tuple{Nothing, Nothing, Diagonal{Float64, Vector{Float64}}}} keys
Stacktrace:
 [1] #s65#84
   @ ~/.julia/packages/Zygote/9X8lu/src/lib/lib.jl:20 [inlined]
 [2] var"#s65#84"(::Any, x::Any, y::Any)
   @ Zygote ./none:0
 [3] (::Core.GeneratedFunctionStub)(::Any, ::Vararg{Any, N} where N)
   @ Core ./boot.jl:571
 [4] (::Zygote.var"#41#42"{typeof(∂(reproduce_theo_error))})(Δ::Float64)
   @ Zygote ~/.julia/packages/Zygote/9X8lu/src/compiler/interface.jl:41
 [5] gradient(::Function, ::Cholesky{Float64, Matrix{Float64}}, ::Vararg{Any, N} where N)
   @ Zygote ~/.julia/packages/Zygote/9X8lu/src/compiler/interface.jl:59
 [6] top-level scope
   @ REPL[10]:1

Reckon you could open a PR with your fix and this MWE as a regression-prevention test @theogf ?

DhairyaLGandhi commented 3 years ago

Hmm, there may be more cases where we have disjointed sets of fields to accum over.

willtebbutt commented 3 years ago

Yeah, almost certainly. We should consider implementing a simple test util that we can run on everything that provides a NamedTuple cotangent to check that its fieldnames match that of the type of its primal.

willtebbutt commented 3 years ago

But for the time being, I'm just glad it's no longer silently producing the wrong result.

DhairyaLGandhi commented 3 years ago

Most cases would be handled automatically. It's only when we don't describe the complete named tuple in an adjoint where this might come up.

willtebbutt commented 3 years ago

Sorry, yeah, I meant test utils to apply to @adjoints -- I agree that Zygote generally takes care of this.