Indeed, nested differentiation of closures has been a well-known problem for quite some time. It is mentioned in multiple places that a fix is in the works. Is there any news on this front?
At this point, this limitation essentially precludes implementing PINNs with ReverseDiff. Perhaps even worse, I recently found that for some destructured Flux and Lux models it is actually possible to do ReverseDiff-over-ForwardDiff and obtain somewhat accurate results, but the gradients are ever-so-slightly wrong (see this discussion on Julia's Discourse and this other issue). Likewise, reverse-over-reverse returns a zero gradient.
Has the root cause of this been identified? Would a fix necessarily involve overhauling the library with a tagging system? Is ForwardDiff currently the only AD library that can do nested AD with closures?
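
For concreteness, here is a minimal sketch of the pattern in question: an inner derivative taken of a closure that captures the outer differentiation variable. The toy function `x * y^2` and the names `inner` and `outer` are made up for illustration and do not come from the linked threads. It uses ForwardDiff-over-ForwardDiff, which as far as I understand handles this case because its tags keep the two perturbations apart.

```julia
using ForwardDiff

# The inner derivative differentiates a closure over y that captures x.
# d/dy (x * y^2) at y = 1 is 2x.
inner(x) = ForwardDiff.derivative(y -> x * y^2, 1.0)

# The outer derivative differentiates that result with respect to x.
# Since inner(x) = 2x, the expected value is 2.
outer = ForwardDiff.derivative(inner, 3.0)

println(outer)
```

The same closure-capturing structure shows up in a PINN loss, where the inner derivative is taken with respect to the network input and the outer gradient with respect to the parameters.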