itt-ustutt / num-dual

Generalized (hyper-) dual numbers in rust

Generically sized dual numbers #58

Closed prehner closed 1 year ago

prehner commented 1 year ago

This PR uses more generic underlying data structures so that the derivative parts of vector dual (including second order dual and hyper dual) numbers can use either constant or dynamically sized vectors. This enables the calculation of gradients with respect to an arbitrary number of variables.
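The idea can be illustrated with a minimal sketch. It is not the crate's actual implementation: the `Eps` trait, the `Dual` struct and the function `f` are hypothetical names chosen for this example. The derivative part is generic over its storage, so the same arithmetic works with a fixed-size array or a `Vec` whose length is chosen at run time.

```rust
use std::ops::{Add, Mul};

/// Hypothetical storage abstraction for the derivative part (not the crate's API).
trait Eps: Clone {
    fn scaled(&self, s: f64) -> Self;
    fn plus(&self, other: &Self) -> Self;
}

/// Constant-size storage: the number of derivatives is part of the type.
impl<const N: usize> Eps for [f64; N] {
    fn scaled(&self, s: f64) -> Self {
        let mut out = *self;
        for x in out.iter_mut() {
            *x *= s;
        }
        out
    }
    fn plus(&self, other: &Self) -> Self {
        let mut out = *self;
        for (o, x) in out.iter_mut().zip(other) {
            *o += x;
        }
        out
    }
}

/// Dynamically sized storage: the number of derivatives is chosen at run time.
impl Eps for Vec<f64> {
    fn scaled(&self, s: f64) -> Self {
        self.iter().map(|x| x * s).collect()
    }
    fn plus(&self, other: &Self) -> Self {
        self.iter().zip(other).map(|(a, b)| a + b).collect()
    }
}

/// A first-order dual number: value plus gradient.
#[derive(Clone, Debug, PartialEq)]
struct Dual<E> {
    re: f64,
    eps: E,
}

impl<E: Eps> Add for Dual<E> {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        Dual { re: self.re + rhs.re, eps: self.eps.plus(&rhs.eps) }
    }
}

impl<E: Eps> Mul for Dual<E> {
    type Output = Self;
    // Product rule: (a * b)' = a' * b + a * b'
    fn mul(self, rhs: Self) -> Self {
        Dual {
            re: self.re * rhs.re,
            eps: self.eps.scaled(rhs.re).plus(&rhs.eps.scaled(self.re)),
        }
    }
}

/// f(x, y) = x * y + x, written once for any derivative storage.
fn f<E: Eps>(x: Dual<E>, y: Dual<E>) -> Dual<E> {
    x.clone() * y + x
}
```

Calling `f` with `eps: [1.0, 0.0]` and `eps: [0.0, 1.0]` seeds a statically sized gradient, while passing `vec![1.0, 0.0]` and `vec![0.0, 1.0]` evaluates the same function with a gradient whose length is only fixed at run time.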

In theory, the performance of dual numbers with constant sizes should not change. In practice, benchmarks showed minor performance regressions. It is worth noting, though, that while the benchmarks in this crate are reproducible, they are somewhat unreliable and opaque with respect to compiler settings. A benchmark in the feos crate showed no significant loss in performance.

Even though this allows the automatic differentiation of functions involving many variables, at some point dual numbers (i.e. forward mode AD) become less performant than reverse mode AD (backpropagation).

To accommodate the changes, the DualNum trait can no longer have Copy, Send and Sync as supertraits. For statically allocated dual numbers these marker traits are still implemented and, if needed, additional trait bounds can be used for specific applications.
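As a rough sketch of what "additional trait bounds" can look like in user code: `DualLike`, `StaticDual`, `DynDual` and both functions below are hypothetical stand-ins, not items from the crate. The trait itself only requires `Clone`, and a function that needs more asks for it explicitly.

```rust
use std::thread;

/// Hypothetical stand-in for the crate's trait, without `Copy`, `Send` or `Sync` supertraits.
trait DualLike: Clone {
    fn re(&self) -> f64;
}

/// Statically sized: can still derive `Copy`, and is automatically `Send` and `Sync`.
#[derive(Clone, Copy)]
struct StaticDual {
    re: f64,
    eps: [f64; 2],
}

/// Dynamically sized: owns a heap allocation, so it cannot be `Copy`.
#[derive(Clone)]
struct DynDual {
    re: f64,
    eps: Vec<f64>,
}

impl DualLike for StaticDual {
    fn re(&self) -> f64 {
        self.re
    }
}

impl DualLike for DynDual {
    fn re(&self) -> f64 {
        self.re
    }
}

/// Works for every implementor, because it only relies on the trait itself.
fn sum_re<D: DualLike>(values: &[D]) -> f64 {
    values.iter().map(|d| d.re()).sum()
}

/// Needs to copy the value and move it to another thread,
/// so the extra bounds are stated at the call site of the trait.
fn on_worker_thread<D: DualLike + Copy + Send + 'static>(d: D) -> f64 {
    let copy = d; // `d` stays usable because `D: Copy`
    let handle = thread::spawn(move || copy.re());
    handle.join().unwrap() + d.re() - d.re()
}
```

With this layout, `on_worker_thread` accepts a `StaticDual` but rejects a `DynDual` at compile time, while `sum_re` accepts both.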

Finally, for dynamically sized dual numbers the number of derivatives is only known at run time. Therefore, it is impossible for functions like from or zero to allocate the appropriate amount of memory. To avoid unexpected behavior arising from possibly empty vectors, all derivative parts are wrapped in a Derivative struct that contains the actual vector within an Option. This has the added benefit that algebraic operations can be skipped if one or more of the operands is not a function of the independent variable.
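A simplified sketch of that wrapper, assuming a plain `Vec<f64>` as the underlying vector; the struct layout and the helper names `none`, `some` and `scaled` are illustrative and not taken from the crate. A constant created with `from` carries `None` instead of a zero vector of unknown length, and arithmetic on `None` does no work.

```rust
use std::ops::{Add, Mul};

/// Derivative part that may be absent (illustrative layout, not the crate's definition).
#[derive(Clone, Debug, PartialEq)]
struct Derivative(Option<Vec<f64>>);

impl Derivative {
    fn none() -> Self {
        Derivative(None)
    }
    fn some(v: Vec<f64>) -> Self {
        Derivative(Some(v))
    }
    /// Scaling an absent derivative stays absent; no loop is executed.
    fn scaled(&self, s: f64) -> Self {
        Derivative(self.0.as_ref().map(|v| v.iter().map(|x| x * s).collect()))
    }
}

impl Add for Derivative {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        Derivative(match (self.0, rhs.0) {
            // Both operands depend on the variables: element-wise sum.
            (Some(a), Some(b)) => Some(a.iter().zip(&b).map(|(x, y)| x + y).collect()),
            // Only one operand depends on the variables: reuse it unchanged.
            (Some(a), None) | (None, Some(a)) => Some(a),
            // Neither does: the result is a constant as well.
            (None, None) => None,
        })
    }
}

#[derive(Clone, Debug, PartialEq)]
struct Dual {
    re: f64,
    eps: Derivative,
}

/// A constant needs no knowledge of the number of derivatives.
impl From<f64> for Dual {
    fn from(re: f64) -> Self {
        Dual { re, eps: Derivative::none() }
    }
}

impl Mul for Dual {
    type Output = Self;
    // Product rule: (a * b)' = a' * b + a * b'
    fn mul(self, rhs: Self) -> Self {
        Dual {
            re: self.re * rhs.re,
            eps: self.eps.scaled(rhs.re) + rhs.eps.scaled(self.re),
        }
    }
}
```

Multiplying a variable by `Dual::from(3.0)` only scales the one existing derivative vector, and multiplying two constants yields a constant without touching any vector.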