sethaxen closed this 5 years ago
I still don't understand how it was derived, but this looks like a difference in adjoint conventions: simply conjugating the input and output adjoints in Theano's implementation produces the correct adjoint for Zygote. I will submit a PR.
It's clear from this line that these adjoints aren't supported: https://github.com/FluxML/Zygote.jl/blob/7dff4155f96b6675adb02916c7ad262e855abe1d/src/lib/array.jl#L416, which throws away the imaginary part of the adjoint. But there's another issue that persists whether or not `real` is used (using `ngradient` from the tests):

Unsurprisingly, `db` is 0, but then `da ≈ Δb`. If you remove the `real`, you also find that `db ≈ Δa`. But if `f` zeros out `b`, then the correct `da` is returned.

I've been trying to work out the changes that would be necessary to fix this, but the code cites this paper, which as far as I can tell only gives the forward-mode derivative of `exp`. I'm not certain how this reverse-mode implementation was derived from that. Can anyone shed any light on this?
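For reference, here is one way a reverse-mode rule can be obtained from a forward-mode derivative of `exp`. This is a minimal sketch, not Zygote's or Theano's implementation. It assumes the adjoint is defined with respect to the real inner product `real(dot(X, Y))`; other conventions differ by a conjugation. Under that assumption the pullback of `Δ` at `A` is the forward-mode (Fréchet) derivative evaluated at `A'` in the direction `Δ`. The names `frechet_exp` and `pullback_exp` are hypothetical helpers introduced only for illustration, and the forward-mode derivative is computed with the block-triangular identity rather than the method from the cited paper.

```julia
using LinearAlgebra

# Forward-mode (Fréchet) derivative of exp at A in direction E.
# Uses the identity exp([A E; 0 A]) = [exp(A) L(A, E); 0 exp(A)],
# so L(A, E) is the upper-right block. (Hypothetical helper.)
function frechet_exp(A::AbstractMatrix, E::AbstractMatrix)
    n = size(A, 1)
    M = [A E; zero(A) A]
    return exp(M)[1:n, n+1:2n]
end

# Candidate reverse-mode rule, assuming the real inner product
# real(dot(X, Y)): apply the forward-mode derivative at A' to Δ.
# (Hypothetical helper.)
pullback_exp(A::AbstractMatrix, Δ::AbstractMatrix) = frechet_exp(Matrix(A'), Δ)

# Small non-Hermitian complex example.
A = ComplexF64[0.1+0.2im 0.3-0.1im; -0.2+0.4im 0.5+0.0im]
E = ComplexF64[1.0+0.5im -0.3+0.2im; 0.7-0.1im 0.2+0.9im]
Δ = ComplexF64[0.4-0.6im 1.1+0.3im; -0.5+0.2im 0.8-0.7im]

# Defining property of an adjoint: both inner products should agree.
lhs = real(dot(Δ, frechet_exp(A, E)))
rhs = real(dot(pullback_exp(A, Δ), E))
println(isapprox(lhs, rhs; rtol = 1e-10))
```

The final check is the defining property of an adjoint: the pullback must satisfy `⟨Δ, L(A, E)⟩ = ⟨pullback(A, Δ), E⟩` for every direction `E`. Comparing a candidate rule against this identity, or against finite differences such as `ngradient`, should show whether a mismatch comes from the conjugation convention or from the derivative itself.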