msurtsukov / neural-ode

Jupyter notebook with Pytorch implementation of Neural Ordinary Differential Equations

backpropagation doubt #1

Closed magus96 closed 5 years ago

magus96 commented 5 years ago

Something I haven't been able to get my head around is the need to find d(a)/dt. Isn't the gradient of the loss function enough for backpropagation? Sorry if it's a trivial doubt.

msurtsukov commented 5 years ago

You can backpropagate the gradient directly through the operations in the ODE solver. However, this is computationally expensive, prone to accumulating numerical error, and its memory consumption scales linearly with the "time" between observations. Instead, you can solve another ODE (the one with da/dt), called the adjoint ODE, backwards in time to compute the backpropagated gradient. This is the main feature of the approach. Hope I understood your question correctly.
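For readers landing here later, a minimal sketch of that idea (not the notebook's actual code; the dynamics `f`, the plain Euler solver, and the step count below are simplified placeholders for illustration): the adjoint state a(t) = dL/dz(t) is integrated backwards in time together with the accumulated parameter gradient, using vector-Jacobian products from autograd instead of storing the whole forward computation graph.

```python
import torch

def f(z, t, theta):
    # hypothetical dynamics: a simple linear vector field dz/dt = z @ theta^T
    return z @ theta.T

def euler_solve(z0, t0, t1, theta, n_steps=100):
    # plain forward Euler integration of dz/dt = f(z, t, theta)
    h = (t1 - t0) / n_steps
    z, t = z0, t0
    for _ in range(n_steps):
        z = z + h * f(z, t, theta)
        t = t + h
    return z

def adjoint_gradients(z0, t0, t1, theta, dLdz1, n_steps=100):
    """Integrate the adjoint system backwards from t1 to t0.

    The adjoint a(t) = dL/dz(t) obeys da/dt = -a^T df/dz, and the parameter
    gradient accumulates as dL/dtheta = integral of a^T df/dtheta over [t0, t1].
    Memory stays O(1) in the number of solver steps.
    """
    h = (t1 - t0) / n_steps
    # start from the end state z(t1) and the incoming gradient dL/dz(t1)
    z = euler_solve(z0, t0, t1, theta, n_steps).detach()
    a = dLdz1.clone()
    dLdtheta = torch.zeros_like(theta)
    t = t1
    for _ in range(n_steps):
        z_req = z.detach().requires_grad_(True)
        theta_req = theta.detach().requires_grad_(True)
        fz = f(z_req, t, theta_req)
        # vector-Jacobian products a^T df/dz and a^T df/dtheta via autograd
        a_dfdz, a_dfdtheta = torch.autograd.grad(
            fz, (z_req, theta_req), grad_outputs=a
        )
        # step the state, the adjoint, and the parameter gradient backwards
        z = z - h * fz.detach()
        a = a + h * a_dfdz
        dLdtheta = dLdtheta + h * a_dfdtheta
        t = t - h
    return a, dLdtheta  # dL/dz(t0) and dL/dtheta

if __name__ == "__main__":
    torch.manual_seed(0)
    theta = torch.randn(3, 3) * 0.1
    z0 = torch.randn(5, 3)
    z1 = euler_solve(z0, 0.0, 1.0, theta)
    dLdz1 = 2 * z1  # gradient of L = ||z1||^2 with respect to z1
    dLdz0, dLdtheta = adjoint_gradients(z0, 0.0, 1.0, theta, dLdz1)
```

The point is that nothing inside `euler_solve` needs to be recorded on the autograd tape; the backward pass only ever differentiates a single evaluation of `f` at a time.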

magus96 commented 5 years ago

Yes, I later came across this as a fine point in your notebook. I think it should be highlighted more. I finally understand this paper. Thanks a lot!