tracel-ai / burn

Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.
https://burn.dev
Apache License 2.0
8.89k stars 440 forks

Support computing of partial derivatives #121

Closed Quba1 closed 11 months ago

Quba1 commented 1 year ago

Feature description

Add an (idiomatic) way of computing (mixed) partial derivatives of models.

Feature motivation

Most automatic differentiation libraries provide a first-class way of computing the gradient of models of the form f(X) -> Y. With the introduction of Physics-Informed Neural Networks, there is a need for efficient computation of (mixed) partial derivatives (e.g. d²f/(dx dy)) of models of the form f(x, y, z) -> u.
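
To make the request concrete (a purely illustrative example, not taken from a specific PINN): a PINN for the 1-D heat equation

    du/dt = alpha * d²u/dx²

needs the derivatives du/dt and d²u/dx² of the network output u(x, t) taken with respect to the network inputs (not its weights), and mixed terms such as d²u/(dx dy) can appear for other PDEs.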

In the currently most popular ML libraries, computing partial derivatives comes with a significant performance overhead and sometimes requires some "hackery".

As far as I can see, burn does not provide a method for computing partial derivatives of models. I believe an implementation of this feature in Rust could benefit from the language's high performance, and if implemented as a first-class feature it could give burn a significant advantage over other ML libraries.

(Optional) Suggest a Solution

In terms of API, PyTorch and TensorFlow provide a method for computing general gradients of the form grad(model_output, one_of_model_inputs) -> partial_derivative_of_model_output_wrt_input.

This API is convenient and easy to understand, but it requires multiple function calls for second and higher derivatives, which can introduce performance overhead; in Rust, however, this might not be an issue.
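
As a very rough sketch of that call pattern (the DiffTensor trait and grad_of method below are hypothetical, not existing burn or PyTorch APIs), a mixed second derivative could be computed by applying a gradient operation twice, provided the returned gradient is itself differentiable:

/// Hypothetical trait: gradients are returned as tensors that are
/// themselves differentiable, so a gradient call can be chained.
trait DiffTensor: Sized {
    /// Returns d(output)/d(self) as another differentiable tensor.
    fn grad_of(&self, output: &Self) -> Self;
}

/// Mixed second-order partial d²u/(dx dy) via two gradient calls,
/// mirroring the grad(model_output, one_of_model_inputs) pattern above.
fn mixed_partial<T: DiffTensor>(f: impl Fn(&T, &T) -> T, x: &T, y: &T) -> T {
    let u = f(x, y);            // u = f(x, y)
    let du_dx = x.grad_of(&u);  // first call: du/dx
    y.grad_of(&du_dx)           // second call: d²u/(dx dy)
}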

nathanielsimard commented 1 year ago

I'm unsure I understand the concept of mixed partial derivatives, but it would be nice if burn could satisfy your use case.

The Gradients struct provides a way of getting partial derivatives, though the API could be improved. Using the concrete type from burn-autodiff, you can do something like this:

fn run<B: Backend>() {
    let a = ADTensor::<B, 2>::random(...); // shape/distribution arguments elided
    let b = ADTensor::<B, 2>::random(...);
    let y = some_function(&a, &b);

    // Backward pass: collects the gradients of y for every tracked input tensor.
    let grads = y.backward();

    let grad_a = grads.wrt(&a); // d some_function / da
    let grad_b = grads.wrt(&b); // d some_function / db
}
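
Note that this gives the first-order partial derivatives with respect to a and b. For a mixed second-order derivative such as d²y/(da db), the gradient returned for a would itself have to be differentiable so that backward could be applied to it again; as far as I can tell, that is the part the feature request is really about.
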
antimora commented 11 months ago

I am closing this because, for now, we do not have more information and specifics about the feature request. Please feel free to reopen if more information is provided.