Closed RS-Coop closed 3 months ago
It seems that it would be possible to further optimize this by dropping the field for x, and instead just updating the partials field of dual_cache1. I can't seem to figure out the process for doing this, is it possible?
It's possible. See some of the tricks in the Jacobian-vector product code.
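As a rough sketch of what that could look like (not the code from the package), the primal values can be stored once in the dual cache, and each product then only rewrites the partials. `Dual` is an immutable struct, so "updating the partials" means rebuilding each element from its old value and the new direction. The names `HvpTag`, `T`, and `hvp_cached!` are made up for this illustration, and `f` is chosen so that the Hessian is 2I and the result is easy to check by hand.

```julia
using ForwardDiff, Zygote
using ForwardDiff: Dual, Tag, value, partials

# Hypothetical tag type for this sketch, so the duals are not tagged with `nothing`.
struct HvpTag end
const T = typeof(Tag(HvpTag(), Float64))

f(x) = sum(abs2, x)                  # Hessian is 2I, so H*v == 2v

x = [1.0, 2.0, 3.0]
dual_cache1 = Dual{T}.(x, zero(x))   # primal values stored once, partials zeroed

# Hypothetical helper: forward-over-reverse Hessian-vector product.
function hvp_cached!(hv, f, dual_cache, v)
    # Keep each stored value and swap in the new direction as the single partial.
    dual_cache .= Dual{T}.(value.(dual_cache), v)
    # Zygote's gradient is out of place and returns an array of duals here.
    g = first(Zygote.gradient(f, dual_cache))
    # The first partial of each gradient entry is the matching entry of H*v.
    hv .= partials.(g, 1)
    return hv
end

v = [1.0, 0.0, -1.0]
hv = similar(x)
hvp_cached!(hv, f, dual_cache1, v)
```

With this layout no separate x field is needed, but moving to a new x means rebuilding the values in the cache as well.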
Are we saving anything by using dual_cache2 when the Zygote gradient operation cannot be performed in-place?
Nope. That's more for when we get around to doing this with ReverseDiff.jl or Enzyme.jl
Can further specification of types improve performance? For instance, I know that the function type F could be further specified as taking a vector as input and returning a scalar.
Nope. But fixing the tag types would be good. It's not good to tag nothing.
Slightly off topic from the other questions, is the symmetry of the Hessian automatically being exploited by the AD tools, or does that even make sense?
It is. There's a step at the end of a Hessian calculation which uses it.
Thanks for the responses! I went looking for the tricks you mentioned in the JVP code, but I can't seem to find anything useful, could you point me in a more specific direction?
DifferentiationInterface.jl has an hvp! function which works on many different backend combinations (not just ForwardDiff over Zygote) and allows partial caching, if you're interested.
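For reference, a minimal usage sketch, assuming a DifferentiationInterface.jl version of 0.6 or later, where tangents and results are passed as tuples (earlier releases took a single vector instead). The test function and the vectors are placeholders picked for this example.

```julia
using DifferentiationInterface
import ForwardDiff, Zygote

f(x) = sum(abs2, x)                  # Hessian is 2I, so H*v == 2v
x = [1.0, 2.0, 3.0]
v = [1.0, 0.0, -1.0]

# Forward over reverse: ForwardDiff as the outer backend, Zygote as the inner one.
backend = SecondOrder(AutoForwardDiff(), AutoZygote())

# Preparation does the reusable setup once for inputs of this type and size.
prep = prepare_hvp(f, backend, x, (v,))

hv = similar(x)
hvp!(f, (hv,), prep, backend, x, (v,))   # writes H*v into hv
```

The same `prep` object can be passed again for later calls with other x and v of the same type and size.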
The hvp! function is not an operator.
Though this operator exists in the package and the issue just wasn't closed https://github.com/JuliaDiff/SparseDiffTools.jl/blob/master/src/differentiation/jaches_products.jl#L292-L330
The hvp! function is not an operator.
Which is why I called it a function. The associated operators will probably be found in AutoDiffOperators.jl, @oschulz seemed enthusiastic about making the switch to DI:
It also needs to satisfy the full SciMLOperator interface before it'll be useful in solvers, but yes that's the missing step.
Using the existing hvp operator and the forward over back AD hvp function in this repository I have made the following modified hvp operator.
I have a couple of questions related to this:
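1. It seems that it would be possible to further optimize this by dropping the field for x, and instead just updating the partials field of dual_cache1. I can't seem to figure out the process for doing this, is it possible?
2. Are we saving anything by using dual_cache2 when the Zygote gradient operation cannot be performed in-place?
3. Can further specification of types improve performance? For instance, I know that the function type F could be further specified as taking a vector as input and returning a scalar.

Thanks for the help!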