piever opened this issue 1 month ago
Any chance that https://github.com/JuliaDiff/ForwardDiff.jl/pull/669 solves this?
Somehow it doesn't... Unless I messed something up, I checked out https://github.com/JuliaDiff/ForwardDiff.jl/pull/669 (manually changing the ForwardDiff version number to 0.10) and still get
```julia
julia> Zygote.gradient(l1, cu(cplx))
(ComplexF32[NaN32 + NaN32*im],)
```
which is weird, because indeed `hypot` differentiates just fine:
```julia
julia> f(x) = hypot(x, 0, 0)
f (generic function with 1 method)

julia> ForwardDiff.derivative(f, 0.0)
1.0

julia> ForwardDiff.derivative(f, -0.0)
-1.0
```
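One guess on my side (not something verified in this thread): `abs` on a `Complex` calls the two-argument `hypot(real(z), imag(z))`, and its partial with respect to the first argument, `x / hypot(x, y)`, is 0/0 at the origin, which would match the `Dual{Nothing}(0.0,NaN)` in the description. A quick check of that path:

```julia
using ForwardDiff

# Base defines abs(z::Complex) as hypot(real(z), imag(z)), so the Dual goes
# through the two-argument hypot rather than the three-argument form above.
g(x) = hypot(x, 0.0)
ForwardDiff.derivative(g, 0.0)   # NaN here would point at the two-argument rule
```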
Bug description
I've experienced the following inconsistency between GPU and CPU gradient computation for `sum(abs, _)`.
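Something along these lines reproduces it (a minimal sketch, assuming `l1` is the `sum(abs, _)` loss and `cplx` a one-element complex array containing zero):

```julia
using CUDA, Zygote

cplx = ComplexF32[0.0f0 + 0.0f0im]   # the zero entry is where the trouble shows up
l1(x) = sum(abs, x)

Zygote.gradient(l1, cplx)       # CPU result, for comparison
Zygote.gradient(l1, cu(cplx))   # GPU result: (ComplexF32[NaN32 + NaN32*im],)
```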
The last one is particularly problematic, as it leads to `NaN` values in the gradient that may be hard to understand in a more complex model.

Slack discussion
On Slack, @mcabbott explained to me the most likely cause for this: `sum(abs, x)` gets rewritten to `sum(abs.(x))`, and the broadcasting part is differentiated via ForwardDiff:

```julia
julia> abs(ForwardDiff.Dual(0,1) + 0im)
Dual{Nothing}(0.0,NaN)
```
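Just to spell out why the zero entries matter (a sketch on my side, not something from the Slack thread): whichever component of a complex zero carries the dual perturbation, the partial that `abs` propagates is NaN:

```julia
using ForwardDiff

z = 0.0 + 0.0im   # a zero entry, like the ones in cplx

# Perturbation entering through the real part (same call as above):
abs(ForwardDiff.Dual(real(z), 1) + imag(z) * im)   # Dual{Nothing}(0.0, NaN)

# Perturbation entering through the imaginary part:
abs(real(z) + ForwardDiff.Dual(imag(z), 1) * im)   # expected to be NaN as well
```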
```
(jl_FHvUua) pkg> st
Status `/tmp/jl_FHvUua/Project.toml`
  [052768ef] CUDA v5.5.2
  [e88e6eb3] Zygote v0.6.71
```