Implement adaptors to permit gradient calculation for functions of more complex codomains

Mikolaj / horde-ad

Higher Order Reverse Derivatives Efficiently - Automatic Differentiation library based on the paper "Provably correct, asymptotically efficient, higher-order reverse-mode automatic differentiation"

BSD 3-Clause "New" or "Revised" License


Implement adaptors to permit gradient calculation for functions of more complex codomains #68

Closed Mikolaj closed 9 months ago

Mikolaj commented 2 years ago

This is a (probably harder) continuation of #66, which deals with domains. Currently, codomains of any rank are permitted, but it would be great to extend them in the same way as the domains in #66 (to nested tuples of traversables of any rank). A good first step may be to extend the internal representation of codomains of objective functions to Domains r, which has always been the internal representation of the domains. However, this is a dual situation, not a symmetric one, so a direct approach may prove better.

Edit: Extending the internal representation of codomains probably requires adding another type of delta expression, with a constructor that tuples delta expressions of any rank. This should not be hard to do, but it may hurt performance or add complexity to the gradient evaluation algorithm, so we should be mindful of the costs.
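To make the tupling-constructor idea concrete, here is a minimal sketch over scalar inputs. Everything in it (the Delta type, its constructors, eval) is a hypothetical toy written for illustration and is not horde-ad's actual delta expression type or evaluator. It only shows how a Pair constructor lets the backward pass accept a cotangent of tuple type.

```haskell
{-# LANGUAGE GADTs #-}
import qualified Data.Map.Strict as M

-- Hypothetical toy delta expressions, indexed by the type of cotangent
-- they accept. This is NOT horde-ad's Delta type.
data Delta a where
  Input :: Int -> Delta Double                      -- sensitivity of input number i
  Zero  :: Delta Double
  Scale :: Double -> Delta Double -> Delta Double
  Add   :: Delta Double -> Delta Double -> Delta Double
  Pair  :: Delta a -> Delta b -> Delta (a, b)       -- the extra tupling constructor

-- Backward pass: push a cotangent through a delta expression and
-- accumulate the contributions per input index.
eval :: Delta a -> a -> M.Map Int Double -> M.Map Int Double
eval (Input i)   c        = M.insertWith (+) i c
eval Zero        _        = id
eval (Scale k d) c        = eval d (k * c)
eval (Add d e)   c        = eval e c . eval d c
eval (Pair d e)  (c1, c2) = eval e c2 . eval d c1

-- Delta expression of f (x0, x1) = (x0 * x1, x0 + x1) at the point (3, 4).
example :: Delta (Double, Double)
example =
  Pair (Add (Scale 4 (Input 0)) (Scale 3 (Input 1)))
       (Add (Input 0) (Input 1))
```

In this toy, evaluating example with the cotangent (1, 10) accumulates contributions from both components into a single gradient map. Note that each component is traversed separately, which is one place where the extra cost mentioned above could come from.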

Mikolaj commented 2 years ago

Yet another approach may be defining tuple instances for IsPrimal, etc., though it feels messy.
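As a rough illustration of what tuple instances would look like, here is a sketch with a hypothetical class Cotangent standing in for a class such as IsPrimal. The class name and its two methods are assumptions made up for this example; the real IsPrimal interface is different.

```haskell
-- Hypothetical stand-in class; NOT the interface of horde-ad's IsPrimal.
class Cotangent a where
  zeroC :: a            -- the zero cotangent
  addC  :: a -> a -> a  -- accumulation of two cotangents

instance Cotangent Double where
  zeroC = 0
  addC  = (+)

-- One instance per tuple arity is needed; nesting then comes for free.
instance (Cotangent a, Cotangent b) => Cotangent (a, b) where
  zeroC = (zeroC, zeroC)
  addC (a1, b1) (a2, b2) = (addC a1 a2, addC b1 b2)

instance (Cotangent a, Cotangent b, Cotangent c) => Cotangent (a, b, c) where
  zeroC = (zeroC, zeroC, zeroC)
  addC (a1, b1, c1) (a2, b2, c2) = (addC a1 a2, addC b1 b2, addC c1 c2)
```

The messiness shows up as soon as the class has more methods or more superclasses: every one of them has to be repeated for every supported arity.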

Mikolaj commented 10 months ago

This is done (without adaptors) for the non-symbolic case by defining DualPart @() (HVectorPseudoTensor ranked); see the test testSin0revhFoldZipR. There is a clear path toward symbolic differentiation of objective functions with complex codomains (see the comment and example "define DerivativeStages AstHVector to make this possible"), but it's hard to predict whether any blockers will emerge along this path. The last step would be adaptors, so that any nested tuples (and more) are handled and not only heterogeneous vectors.

Mikolaj commented 9 months ago

The symbolic differentiation of objective functions whose codomains are heterogeneous vectors of tensors is now complete and the test passes. It's also terribly unwieldy, just like any example using the untyped heterogeneous vectors. Adding adaptors may make such use cases much more pleasant, at the cost of complicating the type of rev, which is already too cryptic to be informative to a casual reader anyway. In any case, it's worth doing, even if at relatively low priority.
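For readers unfamiliar with the adaptor idea, the sketch below shows the general shape, assuming a flat list of Double as a stand-in for an untyped heterogeneous vector. The class Adaptable, its methods toFlat and fromFlat, and the helper viaFlat are all hypothetical names invented for this example; they are not horde-ad's adaptor API.

```haskell
-- Hypothetical adaptor class: converts a typed value to and from a flat,
-- untyped representation. NOT horde-ad's actual adaptor interface.
class Adaptable a where
  toFlat   :: a -> [Double]
  -- Consume a prefix of the flat list; return the value and the leftover.
  fromFlat :: [Double] -> Maybe (a, [Double])

instance Adaptable Double where
  toFlat x = [x]
  fromFlat (x : rest) = Just (x, rest)
  fromFlat []         = Nothing

instance (Adaptable a, Adaptable b) => Adaptable (a, b) where
  toFlat (a, b) = toFlat a ++ toFlat b
  fromFlat xs = do
    (a, rest)  <- fromFlat xs
    (b, rest') <- fromFlat rest
    pure ((a, b), rest')

-- Give a function on flat vectors a typed, nested-tuple interface.
viaFlat :: (Adaptable a, Adaptable b) => ([Double] -> [Double]) -> a -> Maybe b
viaFlat f x = fst <$> fromFlat (f (toFlat x))
```

The price is visible in the signature of viaFlat: the caller-facing type gains class constraints on both the domain and the codomain, which is the kind of complication to the type of rev mentioned above.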

Mikolaj commented 9 months ago

Differentiation of functions with an arbitrary (nested tuple+) codomain type just fell out of the folds work. The separate tuple components are probably totally non-shared, but finding workarounds or additions to improve this will have to wait for another day.