ACEsuit / ACE.jl

Parameterisation of Equivariant Properties of Particle Systems

Differentiation via ChainRules.jl #27

Closed cortner closed 2 years ago

cortner commented 3 years ago

We should probably redesign differentiation in a more organised way, likely following the ChainRules.jl ideas or even using ChainRules.jl directly. In particular this should enable us to leverage AD tools when needed.

cortner commented 3 years ago

Comments by @ettersi on Zulip:

This sounds like the kind of problem that the Julia community is currently working on in ChainRules.jl. I'd therefore recommend you either directly implement their frule and rrule interface, or at least have a good look at it to learn about the various pitfalls which led to this design.

In a nutshell, here's what I believe their approach is / should be (some design aspects are still work in progress):

I believe this framework applies to your situation pretty straightforwardly. In particular:

In the invariant case, should dphi * dAAdrr be an EuclideanVector or an Adjoint{EuclideanVector}?

In the ChainRules.jl framework, what you are computing here is (g = rrule(phi, rr); g(1)) (the final output is g(1) because that corresponds to the vjp 1 * dphi_drr). I'd therefore choose g = rrule(phi,rr) to be a function g(::Real) -> ::EuclideanVector. The Adjoint is implied by the fact that what we are computing is the output of an rrule.
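To make the invariant case concrete, here is a minimal sketch in plain Julia. The property `phi` and the helper `phi_rrule` are hypothetical stand-ins, not ACE.jl code, and a plain `Vector` plays the role of the EuclideanVector. The sketch follows the actual ChainRules.jl convention, in which a reverse rule returns the value together with the pullback (the shorthand `g = rrule(phi, rr)` above refers to that pullback).

```julia
using LinearAlgebra

# Hypothetical invariant (scalar) property of a single position vector rr.
phi(rr::AbstractVector) = dot(rr, rr)

# Hand-written reverse rule in the ChainRules.jl style: return the value
# together with a pullback mapping an output cotangent to an input cotangent.
function phi_rrule(rr::AbstractVector)
    val = phi(rr)
    # vjp: dphi * dphi_drr, returned as an ordinary vector (no Adjoint wrapper)
    pullback(dphi::Real) = dphi * 2 .* rr
    return val, pullback
end

rr = [1.0, 2.0, 3.0]
val, g = phi_rrule(rr)
grad = g(1.0)   # seeding the pullback with 1 yields the gradient
```

With ChainRulesCore itself, the same rule would be written as a method `ChainRulesCore.rrule(::typeof(phi), rr)`, and the pullback would return a tuple whose first entry is `NoTangent()` for the function `phi` itself.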

In the equivariant case should phi * dAAdrr be an EuclideanMatrix?

In this case, g = rrule(phi,rr) becomes a function g(::Vector) -> ::Vector (unless I'm misinterpreting your problem statement). I'd probably leave it at that, i.e. I would not translate this g(dphi) into a matrix, but of course the details here depend on what you want to do with your derivative.
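A minimal sketch of the equivariant case, under the same assumptions as before: `phi_vec` and `phi_vec_rrule` are hypothetical names, and plain vectors stand in for the ACE.jl types. The point is that the pullback applies the transposed Jacobian to a cotangent vector without ever materialising the Jacobian as a matrix.

```julia
using LinearAlgebra

# Hypothetical equivariant (vector-valued) property: phi_vec(Q*rr) == Q*phi_vec(rr)
# for any rotation Q.
phi_vec(rr::AbstractVector) = dot(rr, rr) .* rr

function phi_vec_rrule(rr::AbstractVector)
    val = phi_vec(rr)
    r2 = dot(rr, rr)
    # The Jacobian is J = r2 * I + 2 * rr * rr'. The pullback computes J' * dphi
    # directly, so J is never stored.
    pullback(dphi::AbstractVector) = r2 .* dphi .+ 2 .* dot(rr, dphi) .* rr
    return val, pullback
end

rr = [1.0, 2.0, 0.0]
val, g = phi_vec_rrule(rr)
out = g([1.0, 0.0, 0.0])   # vjp with the first unit vector
```

If the full Jacobian is needed after all, it can be recovered one row at a time by calling `g` on each unit vector, which is one reason to keep `g` as a function rather than converting it to a matrix up front.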

What is the corresponding "thing" for Spherical vectors or matrices?

Define a differential type for your spherical vectors and matrices, and everything else should follow from there. I guess we can talk about this further once we have an agreement on the earlier points.

cortner commented 3 years ago

Just to record what I've currently implemented as a stop-gap solution: (typing is too strong but ok for now... this is just illustrative, the actual implementation varies...)

*(phi::AbstractProperty, dAA::SVector) = reshape(phi.val[:] * dAA', Size(size(phi.val)..., length(dAA)))

that way we can re-use the matrix multiplication to get from AA to B but it is very risky, and definitely needs rethinking!! This is no more than a hack.
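One way to read the stop-gap is as an outer product that appends a derivative axis to the property value. The sketch below illustrates that reading with plain arrays only; `val` and `dAA` are hypothetical stand-ins for `phi.val` and the derivative of a single AA entry, and the actual implementation may differ.

```julia
# Hypothetical stand-ins: `val` plays the role of phi.val,
# `dAA` the derivative of one AA entry with respect to three coordinates.
val = [1.0 2.0; 3.0 4.0]
dAA = [10.0, 20.0, 30.0]

# Flatten the value, take the outer product with dAA, then restore the
# original shape with one extra trailing axis for the derivative direction.
outer = vec(val) * dAA'
dprop = reshape(outer, size(val)..., length(dAA))

# dprop[i, j, k] == val[i, j] * dAA[k]
```

The risk mentioned above is visible here: nothing in the result records which axes belong to the property and which to the derivative, so later code has to know the layout implicitly.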

cortner commented 3 years ago

another comment from JuLIP: adjoints can be used to evaluate derivatives in a lazy way. This means we can have a relatively simple unified implementation for multiple arguments with respect to which we might want derivatives.
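As a rough illustration of the lazy idea, the sketch below returns one deferred computation per argument, so only the derivatives that are actually requested get evaluated. The function `energy` and the helper `energy_rrule` are hypothetical and not taken from JuLIP or ACE.jl; ChainRulesCore offers `@thunk` for the same purpose, while this version uses plain closures to stay dependency-free.

```julia
using LinearAlgebra

# Hypothetical scalar function of two arguments.
energy(x, w) = dot(w, x .^ 2)

function energy_rrule(x, w)
    val = energy(x, w)
    function pullback(de::Real)
        # Each cotangent is wrapped in a zero-argument closure and is only
        # computed when that closure is called.
        dx = () -> de .* 2 .* w .* x
        dw = () -> de .* x .^ 2
        return (x = dx, w = dw)
    end
    return val, pullback
end

x = [1.0, 2.0]
w = [3.0, 4.0]
val, g = energy_rrule(x, w)
lazy = g(1.0)
grad_w = lazy.w()   # only the derivative with respect to w is evaluated here
```

A single rule written this way covers derivatives with respect to any subset of the arguments, which is the unification the comment above is pointing at.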

cortner commented 2 years ago

this is all happening and in fact mostly done so I'm closing it. new rules will be added as needed.