We should make a new crate that can take in a Rose function and transform it (recursively, i.e. including its dependencies) to produce a new function that computes its Jacobian-vector product (JVP) or vector-Jacobian product (VJP), then expose those transformations in TypeScript.
While recursively transforming the function graph, we'll need to cache transformations for functions we've already seen to avoid exponential blowup. We might as well keep that cache between top-level calls to the autodiff module, perhaps using rc::Weak, since that way we're not keeping derived functions around any longer than they would be anyway.
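A minimal sketch of the Weak-based cache idea, under loud assumptions: `Func`, its fields, and `jvp` are hypothetical stand-ins rather than Rose's actual types, and the "derivative" here only renames the function and recurses into its dependencies. The point is only to show the caching shape: each function holds a `Weak` pointer to its derived counterpart, so a shared dependency is transformed once, and the cache entry dies as soon as nobody else holds the derived function.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical stand-in for a Rose function: a name plus the functions it calls.
pub struct Func {
    pub name: String,
    pub deps: Vec<Rc<Func>>,
    /// Cached derivative. Weak, so the cache alone never keeps it alive.
    jvp_cache: RefCell<Weak<Func>>,
}

impl Func {
    pub fn new(name: &str, deps: Vec<Rc<Func>>) -> Rc<Func> {
        Rc::new(Func {
            name: name.to_string(),
            deps,
            jvp_cache: RefCell::new(Weak::new()),
        })
    }
}

/// Recursively build a placeholder "JVP" of `f`, reusing any cached result
/// that is still alive. A real transformation would rewrite the function body.
pub fn jvp(f: &Rc<Func>) -> Rc<Func> {
    // Bind the upgrade result first so the RefCell borrow ends before we write.
    let cached = f.jvp_cache.borrow().upgrade();
    if let Some(hit) = cached {
        return hit;
    }
    // Each dependency is transformed at most once while its result is alive.
    let deps: Vec<Rc<Func>> = f.deps.iter().map(jvp).collect();
    let derived = Func::new(&format!("jvp({})", f.name), deps);
    *f.jvp_cache.borrow_mut() = Rc::downgrade(&derived);
    derived
}
```

With this shape, calling `jvp` twice on the same function returns the same `Rc` for as long as the first result is held somewhere. Once it is dropped, the next call rebuilds it.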
One wrinkle for caching is that we plan to implement these transformations in multiple layers: first forward-mode, then unzip, then transpose. So if the original function has a Weak pointer to its forward-mode derivative, which has a Weak pointer to its unzipped pair, the linear one of which has a Weak pointer to its transpose, then the middle two could get dropped even if the original function and its VJP are both still there. So maybe we should have each derived function keep a "paper trail" of strong Rc pointers to the function it came from. This does mean we'd be holding onto more functions than strictly necessary, though.
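To make the "paper trail" concrete, here is a hedged sketch. `Stage`, `original`, `derive`, and `vjp` are hypothetical names, each layer is reduced to a label, and the unzip step is simplified to return only the linear half of the pair. Forward pointers (the cache) are `Weak`, backward pointers (the trail) are strong `Rc`, so holding the final VJP keeps every intermediate layer alive and there is no reference cycle.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical stand-in for a function at some layer of the pipeline.
pub struct Stage {
    pub label: String,
    /// Strong "paper trail" pointer back to the function this one came from.
    pub origin: Option<Rc<Stage>>,
    /// Weak cache pointer forward to the next layer's result.
    next: RefCell<Weak<Stage>>,
}

/// Wrap an original (non-derived) function; it has no paper trail.
pub fn original(label: &str) -> Rc<Stage> {
    Rc::new(Stage {
        label: label.to_string(),
        origin: None,
        next: RefCell::new(Weak::new()),
    })
}

/// Apply one layer to `from`, reusing the cached result if it is still alive.
pub fn derive(from: &Rc<Stage>, layer: &str) -> Rc<Stage> {
    let cached = from.next.borrow().upgrade();
    if let Some(hit) = cached {
        return hit;
    }
    let derived = Rc::new(Stage {
        label: format!("{}({})", layer, from.label),
        origin: Some(Rc::clone(from)), // the strong back-pointer
        next: RefCell::new(Weak::new()),
    });
    *from.next.borrow_mut() = Rc::downgrade(&derived);
    derived
}

/// Forward-mode, then unzip (linear half only here), then transpose.
pub fn vjp(f: &Rc<Stage>) -> Rc<Stage> {
    let forward = derive(f, "forward");
    let linear = derive(&forward, "unzip");
    derive(&linear, "transpose")
}
```

Without the `origin` field, `forward` and `linear` would be dropped at the end of `vjp`, and a second call would have to rebuild both before finding that the transpose is no longer reachable through the cache. The cost, as noted above, is that the intermediates stay in memory for as long as the VJP does.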