JuliaDiff / ReverseDiff.jl

Reverse Mode Automatic Differentiation for Julia
Other
348 stars 57 forks source link

Bug: Derivative of transposed-vector times matrix is incorrect. #235

Closed Red-Portal closed 1 year ago

Red-Portal commented 1 year ago

Hi, Found a correctness bug that seems only to affect ReverseDiff.

julia> A = [1 2; 3 4]; x = [5, 6];

julia> [ReverseDiff.gradient(y -> sum(y'*A), x) ;; 
        ReverseDiff.gradient(y -> sum(A'*y), x) ;;
             Zygote.gradient(y -> sum(y'*A), x)[1] ;;
             Zygote.gradient(y -> sum(A'*y), x)[1] ]
2×4 Matrix{Float64}:
 4.0  3.0  3.0  3.0
 6.0  7.0  7.0  7.0
mohamed82008 commented 1 year ago

Thanks for reporting this. Do you have any time to dig deeper into this?

zelong-yin commented 1 year ago

Hi, I also found that the memory allocation benchmarks are no longer correct.

When I tried the example codes it now shows spurious memory allocations reversediff

It seems that these allocations are associated with the use of transposes. When I delete the adjoint operators in the definition they goes away.

Red-Portal commented 1 year ago

@mohamed82008 Hi Mohamed! Unfortunately, my AD knowledge is insufficient to debug the AD frameworks. I do feel that I need to learn this stuff though...

Red-Portal commented 1 year ago

@zelongyin2000 I think it would be better to open a separate issue for that

Red-Portal commented 1 year ago

Thanks @willtebbutt !