This PR uses more generic underlying data structures so that the derivative parts of vector dual (including second order dual and hyper dual) numbers can use either constant or dynamically sized vectors. This enables the calculation of gradients with respect to an arbitrary number of variables.
In theory, the performance of dual numbers with constant sizes should not change. In practice, benchmarks showed minor performance regressions. Note, however, that while the benchmarks in this crate are reproducible, they are somewhat unreliable and opaque with respect to compiler settings. A benchmark in the feos crate showed no significant loss in performance.
Even though this allows the automatic differentiation of functions involving many variables, the cost of dual numbers (i.e. forward mode AD) grows with the number of independent variables, so at some point they become less performant than reverse mode AD (backpropagation).
To accommodate the changes, `DualNum` can no longer have `Copy`, `Send` and `Sync` as supertraits. For statically allocated dual numbers these marker traits are still implemented, and if needed, additional trait bounds can be used for specific applications.
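The effect of dropping the supertraits can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `DualLike`, `StaticDual`, `DynDual`, `sum_re` and `re_on_thread` are hypothetical names, and the real `DualNum` trait has a much larger interface. The point is only that a trait requiring `Clone` alone can be implemented by heap-backed numbers, while code that needs `Copy` or `Send` asks for them explicitly.

```rust
use std::thread;

/// Hypothetical, heavily simplified stand-in for the crate's `DualNum` trait.
/// Only `Clone` is required, so heap-backed numbers can implement it too.
pub trait DualLike: Clone {
    fn re(&self) -> f64;
}

/// Statically sized derivative part: plain data, so it can derive `Copy`.
#[derive(Clone, Copy)]
pub struct StaticDual {
    pub re: f64,
    pub eps: [f64; 2],
}

/// Dynamically sized derivative part: owns a `Vec`, so it is `Clone` but not `Copy`.
#[derive(Clone)]
pub struct DynDual {
    pub re: f64,
    pub eps: Vec<f64>,
}

impl DualLike for StaticDual {
    fn re(&self) -> f64 {
        self.re
    }
}

impl DualLike for DynDual {
    fn re(&self) -> f64 {
        self.re
    }
}

/// Generic code without extra bounds accepts both kinds of dual numbers.
pub fn sum_re<D: DualLike>(xs: &[D]) -> f64 {
    xs.iter().map(|x| x.re()).sum()
}

/// An application that needs the marker traits states them as additional bounds.
pub fn re_on_thread<D: DualLike + Copy + Send + 'static>(x: D) -> f64 {
    thread::spawn(move || x.re()).join().unwrap()
}
```

In this sketch, `re_on_thread` accepts a `StaticDual` but rejects a `DynDual` at compile time, because the latter is not `Copy`.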
Finally, for dynamically sized dual numbers the number of derivatives is only known at run time. Therefore, functions like `from` or `zero` cannot allocate the appropriate amount of memory. To avoid unexpected behavior arising from possibly empty vectors, all derivative parts are wrapped in a `Derivative` struct that contains the actual vector within an `Option`. This has the added benefit that algebraic operations can be skipped if one or more of the operands is not a function of the independent variables.
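The idea of an optional derivative part can be sketched as follows. This is not the crate's implementation: the `Derivative` and `DualVec` types below are simplified, hypothetical stand-ins that use `Vec<f64>` instead of the generic vector types, and the constructor names `none`, `some`, `constant` and `variable` are assumptions. `None` is read as "derivative is identically zero", so a constant needs no allocation and no knowledge of the gradient length.

```rust
use std::ops::{Add, Mul};

/// Hypothetical stand-in for the `Derivative` wrapper: `None` means the
/// derivative is identically zero, so no storage (or length) is needed.
#[derive(Clone, Debug, PartialEq)]
pub struct Derivative(pub Option<Vec<f64>>);

impl Derivative {
    pub fn none() -> Self {
        Derivative(None)
    }

    pub fn some(v: Vec<f64>) -> Self {
        Derivative(Some(v))
    }

    /// Multiply by a scalar; no arithmetic happens if there is no derivative.
    pub fn scale(&self, factor: f64) -> Self {
        Derivative(self.0.as_ref().map(|v| v.iter().map(|x| x * factor).collect()))
    }
}

impl Add for Derivative {
    type Output = Derivative;

    fn add(self, rhs: Derivative) -> Derivative {
        Derivative(match (self.0, rhs.0) {
            // Both operands depend on the variables: elementwise sum.
            (Some(a), Some(b)) => Some(a.iter().zip(&b).map(|(x, y)| x + y).collect()),
            // Only one operand has a derivative: reuse it unchanged.
            (Some(a), None) | (None, Some(a)) => Some(a),
            // Neither does: the result stays "no derivative".
            (None, None) => None,
        })
    }
}

/// Simplified first-order dual number with a dynamically sized gradient.
#[derive(Clone, Debug, PartialEq)]
pub struct DualVec {
    pub re: f64,
    pub eps: Derivative,
}

impl DualVec {
    /// A constant can be created without knowing the number of variables.
    pub fn constant(re: f64) -> Self {
        DualVec { re, eps: Derivative::none() }
    }

    /// The `index`-th of `n` independent variables (unit gradient).
    pub fn variable(re: f64, index: usize, n: usize) -> Self {
        let mut eps = vec![0.0; n];
        eps[index] = 1.0;
        DualVec { re, eps: Derivative::some(eps) }
    }
}

impl Mul for DualVec {
    type Output = DualVec;

    /// Product rule: (a * b)' = a' * b + a * b'.
    fn mul(self, rhs: DualVec) -> DualVec {
        let eps = self.eps.scale(rhs.re) + rhs.eps.scale(self.re);
        DualVec { re: self.re * rhs.re, eps }
    }
}
```

With these definitions, multiplying two constants never touches a vector, and multiplying a constant by a variable only scales the one gradient that exists.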