JuliaAttic / ReverseDiffSource.jl

Reverse automated differentiation from source
MIT License

Tensor Functions #11

Open · rleegates opened this issue 9 years ago

rleegates commented 9 years ago

Hi Frédéric,

as far as I can tell, the package currently only supports functions that yield a scalar value. Any plans on extending this to tensor-valued functions of tensors?

Best regards, Robert

fredo-dedup commented 9 years ago

Hi,

I had no plans for this, but yes, it is possible. Calculation time should be O(n). It would need some prior thinking on how to present the results, especially for higher-order derivatives. I labelled your issue as an enhancement request (not sure I'll have time for this though).

I don't know if this would work for you, but there is a workaround with the current version:
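
A minimal sketch of that kind of workaround, assuming the rdiff(f, init, order=1) form from the README and that it accepts the small closure used below; the example function F, its size, and the indexing loop are purely illustrative, not the original snippet:

using ReverseDiffSource

# Illustrative tensor-valued function of a scalar parameter t.
F(t) = [t^2 sin(t); t^3 cos(t)]

# Work around the scalar-output restriction by differentiating one
# output component at a time through a scalar wrapper.
function dF_dt(t0)
    n, m = size(F(t0))
    out = zeros(n, m)
    for i in 1:n, j in 1:m
        fij = t -> F(t)[i, j]               # scalar-valued slice of F
        dfij = rdiff(fij, (t0,), order=1)   # returns t -> (value, derivative)
        out[i, j] = dfij(t0)[2]
    end
    return out
end

dF_dt(1.0)   # matrix of dF_ij/dt at t = 1.0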

rleegates commented 9 years ago

Hi Frédéric,

thank you for your quick reply. I'll try your workaround when I find the time, as I'm currently involved in another project. As far as I can tell, the workaround covers the differentiation of tensor-valued functions with respect to scalars; I'm sure it could be extended to more complicated structures.

Just FYI, what I'd ideally be looking for is the computation of partial and/or total derivatives of functionals of the type f_{ij}(g_{ij}(x_{ij}), k(x_{ij})), such that the derivative with respect to x yields

df_{ij}/dx_{kl} = ∂f_{ij}/∂g_{mn} ∂g_{mn}/∂x_{kl} + ∂f_{ij}/∂k ∂k/∂x_{kl},

in which either contractions (first term) or dyadic products (second term) appear.

I was pondering doing this symbolically; however, your package would be a nice alternative, as it would let me skip the code generation from the symbolic expression. In addition, my use cases become even more complicated when the tensor function is applied to the eigenvalues of x_{ij}, a point where I'm unsure whether symbolic computation would suffice. If I can be of help in implementing such features, we could continue this discussion by email.

Best regards, Robert

alexbw commented 8 years ago

FYI, if just writing down all the gradients of lots of tensor-valued functions is the blocker, this has been done (at least twice) in the autograd-family of libraries.

In autograd: https://github.com/HIPS/autograd/blob/master/autograd/numpy/numpy_grads.py
In the Torch version of autograd: https://github.com/twitter/torch-autograd/blob/master/src/gradfuns.lua

EDIT: the most confusing gradients in those exhaustive lists above are the ones having to do with tensor resizing, indexing, and broadcasting. I'm happy to help and walk through the code with anyone interested in porting them to Julia.

dfdx commented 8 years ago

@alexbw I'm definitely interested in porting tensor gradients to Julia (e.g. see dfdx/Espresso.jl#2 for some details). Would you suggest any "entry point" to get started (either in code or in theoretical papers)?

alexbw commented 8 years ago

On the issue you linked, I think you're conflating the partial derivatives you need to write with the method you will use to perform automatic differentiation of output w.r.t. input. We do indeed require functions to have scalar output in torch-autograd, but I believe autograd supports calculation of the Jacobian (non-scalar output) by doing multiple passes of the function, one reverse pass per output element (i.e. per row of the Jacobian). So, if you get scalar-valued outputs working, only a small extra effort is needed to get tensor-valued outputs.
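
To illustrate the idea, a rough sketch in Julia; the scalar-output reverse-mode gradient is passed in as an argument so that no particular package API is assumed, and the helper name jacobian_by_rows is hypothetical:

# Build the Jacobian of a vector-valued f from a scalar-only reverse pass:
# one sweep per output element, each sweep filling one row of J.
function jacobian_by_rows(f, x::AbstractVector, gradient)
    y = f(x)
    J = zeros(length(y), length(x))
    for i in 1:length(y)
        fi = z -> f(z)[i]            # scalar slice of the output
        J[i, :] = gradient(fi, x)    # one reverse pass for output element i
    end
    return J
end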

I would recommend just lifting the gradients from autograd or torch-autograd. In autograd the file is called "numpy_grads.py", I believe, and it's "gradfuns.lua" in torch-autograd.

dfdx commented 8 years ago

@alexbw Thanks for your answer. For the code you linked, am I right in saying that the gradients there are represented as Python/Lua functions that take previously computed gradients (i.e. the gradients of the arguments) and produce a new gradient for the current operation itself? That is, something like this:

grad_1 = make_gradient_myfunc(A, B)
grad_1(already_computed_gradients_of_A_and_B)

Also, I don't really understand the meaning of the unbroadcast function that appears so often there. I see that it sums out some dimensions of a tensor, but which ones, and for what purpose?

alexbw commented 8 years ago

Yes. The function signature, for some function such as sum(x, y), is:

gradSum[1] = function(incomingGradient, answerOfSum, x, y) ... end
gradSum[2] = function(incomingGradient, answerOfSum, x, y) ... end

where the two entries calculate the partial gradients with respect to each argument of sum(x, y).

Unbroadcast (used to be called "sumToMatchShape") is used a lot to match gradient shapes when there has been replication. If you replicate a tensor in the forward pass, the action you must take in the backwards pass is to sum (not select) the replicated parts together.
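
For example, a rough Julia translation of such a rule pair; the names grad_add and unbroadcast are illustrative, and only the simple sum-over-expanded-dimensions case is handled:

# Sum the incoming gradient over every dimension along which the argument
# was replicated in the forward pass, so the result has the argument's shape.
function unbroadcast(g::AbstractArray, x)
    x isa Number && return sum(g)
    for d in 1:ndims(g)
        if d > ndims(x) || (size(x, d) == 1 && size(g, d) > 1)
            g = sum(g, dims=d)
        end
    end
    return reshape(g, size(x))
end

# Reverse rules for broadcasted addition z = x .+ y, following the
# (incomingGradient, answer, x, y) signature described above.
grad_add = [
    (g, z, x, y) -> unbroadcast(g, x),   # partial gradient w.r.t. x
    (g, z, x, y) -> unbroadcast(g, y),   # partial gradient w.r.t. y
]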
