Closed. Giodiro closed this issue 2 years ago.
I second this. In many applications, the input to the kernel comes from the output of a parameterized function with parameters \theta. It would be great if we could compute gradients wrt \theta.
Is this already implemented/planned to be implemented?
For now this is not implemented. Differentiating through the kernel itself would not be hard thanks to KeOps, so if you had an already-trained Falkon model you could differentiate through its predictions. Would this be useful for your use case?
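To make that concrete, here is a minimal sketch of what differentiating through the predictions of a trained model could look like. It rebuilds the Nyström expansion f(x) = sum_j alpha_j * exp(-||x - z_j||^2 / (2 sigma^2)) by hand in PyTorch instead of calling Falkon or KeOps directly, and the dummy `centers`, `alpha`, and `sigma` below stand in for whatever a fitted model actually stores, so treat them as assumptions rather than the library's API:

```python
import torch

# Stand-ins for quantities a fitted model would provide: Nystroem
# centers z_j, expansion coefficients alpha_j, and the lengthscale
# sigma used at training time (all assumed here, not read from Falkon).
centers = torch.randn(100, 3)          # (M, d) inducing points
alpha = torch.randn(100, 1)            # (M, 1) learned coefficients
sigma = 1.0

def predict(X):
    """Gaussian-kernel expansion f(X) = K(X, centers) @ alpha."""
    sq_dists = torch.cdist(X, centers).pow(2)
    K = torch.exp(-sq_dists / (2 * sigma ** 2))
    return K @ alpha

# Differentiate the predictions with respect to the inputs.
X_test = torch.randn(5, 3, requires_grad=True)
preds = predict(X_test)
preds.sum().backward()
print(X_test.grad.shape)               # (5, 3): d f / d X_test
```

Replacing the handwritten kernel with the equivalent KeOps LazyTensor formula should keep the same autograd behaviour while scaling to a large number of centers.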
In practice the current model is trained with conjugate gradients, and we cannot simply differentiate through that algorithm with respect to arbitrary, parametrized feature transformations of the data :(
My use-case involves differentiating through the predictions of an already trained model, so I might be able to use KeOps then. Thanks for the tip!
This could allow optimization of kernel parameters with autograd.
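For instance, since the expansion sketched above is ordinary PyTorch code, both a kernel lengthscale and the parameters theta of a transform feeding the kernel can be handed to an optimizer. This is only a sketch of the autograd plumbing under that assumption: it keeps the coefficients alpha fixed, whereas in Falkon they come from the conjugate-gradient solve and would themselves depend on these parameters (the point raised above), and every name here is illustrative rather than part of Falkon's API:

```python
import torch

centers = torch.randn(100, 3)                       # assumed Nystroem centers
alpha = torch.randn(100, 1)                         # assumed (fixed) coefficients

transform = torch.nn.Linear(5, 3)                   # parameterized map g_theta feeding the kernel
log_sigma = torch.nn.Parameter(torch.tensor(0.0))   # kernel lengthscale, optimized on a log scale

opt = torch.optim.Adam(list(transform.parameters()) + [log_sigma], lr=1e-2)

X_val = torch.randn(32, 5)                          # validation inputs in the original space
y_val = torch.randn(32, 1)

for step in range(100):
    Z = transform(X_val)                            # g_theta(x): inputs seen by the kernel
    sq_dists = torch.cdist(Z, centers).pow(2)
    K = torch.exp(-sq_dists / (2 * torch.exp(log_sigma) ** 2))
    loss = torch.mean((K @ alpha - y_val) ** 2)     # validation MSE through the expansion
    opt.zero_grad()
    loss.backward()                                 # gradients w.r.t. theta and the lengthscale
    opt.step()
```

A full solution would also have to account for how alpha changes with the kernel parameters, which is exactly what differentiating through the CG-based training would require.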
Steps: