Closed rem1A closed 2 years ago
There are two ways to specify the output being differentiated: you can return it, or you can store it into a data structure (for example, through an output pointer).
In reverse mode (the default, and what calling __enzyme_autodiff does), the shadow of each input will be +='d by the derivative of each output with respect to that input, times the shadow of that output.
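Written as a formula, the accumulation rule above looks like the following. The index notation is introduced here only for illustration: in_i and out_j are the inputs and outputs, and d_in, d_out are their shadows.

```latex
\[
\texttt{d\_in}_i \mathrel{+}= \sum_j \frac{\partial\,\texttt{out}_j}{\partial\,\texttt{in}_i}\;\texttt{d\_out}_j
\]
```

Because this is an accumulation rather than an assignment, d_in should normally be zero-initialized before the call.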
For example, consider the following:
void f(double* in, double* out) {
    out[0] = in[0] * in[1];
    out[1] = sin(in[1]);
}
To get the derivative of out[0] with respect to each input, you would do the following:
double out[2];
double d_out[2] = { 1.0, 0.0 };
double d_in[2] = {0};
__enzyme_autodiff(f, in, d_in, out, d_out);
printf("%f\n", d_in[0]); // d(out[0]) / d(in[0])
printf("%f\n", d_in[1]); // d(out[0]) / d(in[1])
Alternatively, to get the derivative of out[1] with respect to each input, you would do the following:
double out[2];
double d_out[2] = { 0.0, 1.0 };
double d_in[2] = {0};
__enzyme_autodiff(f, in, d_in, out, d_out);
printf("%f\n", d_in[0]); // d(out[1]) / d(in[0])
printf("%f\n", d_in[1]); // d(out[1]) / d(in[1])
This generalizes to the product of the (transposed) Jacobian with an arbitrary vector (v0, v1), as follows:
double out[2];
double d_out[2] = { v0, v1 };
double d_in[2] = {0};
__enzyme_autodiff(f, in, d_in, out, d_out);
printf("%f\n", d_in[0]); // v0 * d(out[0])/d(in[0]) + v1 * d(out[1])/d(in[0])
printf("%f\n", d_in[1]); // v0 * d(out[0])/d(in[1]) + v1 * d(out[1])/d(in[1])
If the function also has a return value (and it is active), d_in will additionally be +='d by d(return)/d(in).
Hopefully that clarifies things (and please reopen/ask more questions if not).
Also please feel free to make a PR to docs to help explain for others!
Hello,
I have a question about the principle behind Enzyme.
Suppose there is a function:
double func(double x) {
    double y = 0.5 * x;
    return y;
}
and I apply Enzyme to it:
__enzyme_autodiff(func, x, d_x);
Obviously d_x should be 0.5: the return value is y, so d_x is dy/dx, which is easy to understand.
But when the function is more complex and has no return value, it is hard to see what the "y" of the differentiation is.
In addition, suppose several computations involve the variable x, for example:
y = 0.5 * x; z = 0.4 * x;
If there is neither "return y" nor "return x", what is the target of the differentiation? I know this can be tested with simple code, but that is not the key question. I want to understand the principle: when the function has no return value, is the result dy/dx, dz/dx, or something else?