jaamestaay / M2R-Group-29

This repository complements the MATH50002 (M2R) research report and presentation, a compulsory year two Mathematics module at Imperial College London. We introduced and explored neural networks and the techniques required to train them. We then discussed neural ODEs and the improved accuracy they offer, before extending the framework to neural CDEs.

Theory: Neural ODEs (and possibly CDEs) and their applications #3

Closed: jaamestaay closed this issue 3 months ago

LiuJiankuan commented 3 months ago

I found roughly three applications of machine learning:

1. Handwritten digit identification: this is essentially a neural network function with n*n pixel inputs and 10 outputs (the digits 0-9). We train the model by backpropagation to correct the parameters and thereby increase the accuracy of identification (a minimal sketch follows this list).
2. Continuous dynamics of financial time series data, such as stock prices and market indices. Neural ODEs naturally capture the continuous evolution of market data by embedding neural network architectures within the differential equation solving process. This allows real-time updates and predictions based on the latest market information, offering investors more accurate risk assessments and decision-making tools.
3. Autonomous driving technology. Machine learning combined with ordinary differential equations (ODEs) is used to model and control vehicle dynamics accurately. ODEs simulate the vehicle's motion through Newtonian mechanics, predicting future states from current velocities and accelerations.
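A minimal sketch of application 1, assuming PyTorch; the 28x28 input size (as in MNIST), the layer widths, and the dummy batch are illustrative assumptions, not details from the report:

```python
import torch
import torch.nn as nn

class DigitClassifier(nn.Module):
    def __init__(self, n=28):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),            # n*n pixels -> vector of length n*n
            nn.Linear(n * n, 128),
            nn.ReLU(),
            nn.Linear(128, 10),      # 10 outputs, one per digit 0-9
        )

    def forward(self, x):
        return self.net(x)

model = DigitClassifier()
loss_fn = nn.CrossEntropyLoss()
optimiser = torch.optim.SGD(model.parameters(), lr=1e-2)

# One training step on a placeholder batch: backpropagation computes
# the gradient of the loss with respect to every parameter.
images = torch.randn(64, 1, 28, 28)
labels = torch.randint(0, 10, (64,))
loss = loss_fn(model(images), labels)
loss.backward()       # backpropagation
optimiser.step()      # correct the parameters
optimiser.zero_grad()
```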

tl2622 commented 3 months ago

Theory: I have mainly studied Section 3.3 of the rough path lecture notes. Here is a summary.

Main goal: optimise the loss function to achieve the best fit.
Method: compute the gradient of the loss function with respect to the parameters, then use SGD or Adam to locate a minimum (a minimal sketch of this loop is below).
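A minimal sketch of that optimisation loop, assuming PyTorch; the least-squares loss and synthetic data are placeholders of my own, not from the notes:

```python
import torch

# Synthetic data: a noisy linear model with true weights (1, -2, 0.5).
X = torch.randn(100, 3)
y = X @ torch.tensor([1.0, -2.0, 0.5]) + 0.1 * torch.randn(100)

theta = torch.zeros(3, requires_grad=True)
optimiser = torch.optim.Adam([theta], lr=0.1)  # or torch.optim.SGD

for step in range(200):
    loss = ((X @ theta - y) ** 2).mean()   # loss function
    optimiser.zero_grad()
    loss.backward()                        # gradient w.r.t. the parameters
    optimiser.step()                       # move towards a minimum
```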

Neural ODEs and CDEs:

  1. Direct backpropagation through the internal operations of a differential equation solver (proof explained in other sections).

This method is known as discretise-then-optimise. Generally speaking, this approach is fast to evaluate and produces accurate gradients, but it is memory-inefficient, as every internal operation of the solver must be recorded.

  2. The adjoint equation, with two additional calls to the ODE solver (as the two proofs are similar, the proof will be explained only in the CDE section).

Adjoint-based methods are usually slightly slower to evaluate, because one needs to recalculate y(t) in the backward pass. They may also introduce small errors compared to direct backpropagation through the ODE solver, so they should generally be used when memory is a concern. A sketch of both approaches follows.
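A minimal sketch contrasting the two gradient methods, using torchdiffeq (linked in a later comment); the toy vector field and squared-error loss are my own illustrative choices:

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint, odeint_adjoint

class VectorField(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 2))

    def forward(self, t, y):
        return self.net(y)

func = VectorField()
y0 = torch.randn(8, 2)
t = torch.linspace(0.0, 1.0, 10)

# 1. Discretise-then-optimise: backpropagate through the solver's
#    internal operations (fast, accurate gradients, memory-hungry).
loss = odeint(func, y0, t)[-1].pow(2).mean()
loss.backward()

# 2. Adjoint method: solve an adjoint ODE backwards in time instead of
#    storing the forward computation (memory-efficient, slightly slower).
func.zero_grad()
loss = odeint_adjoint(func, y0, t)[-1].pow(2).mean()
loss.backward()
```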

tl2622 commented 3 months ago

Application: Neural CDEs for Irregular Time Series: focus on the main part of the paper as well as Appendices A and D. Python code packages are available on GitHub (see the paper for details).

New Directions in the Applications of Rough Path Theory: focus on Chapter 3, which includes a theoretical algorithm for learning a neural CDE (this might not be needed for our implementation, but it is good material for theoretical background). The connection to RNNs is currently extra material.

tl2622 commented 3 months ago

Here are some useful resources on Python packages for CDEs:

- https://github.com/patrick-kidger/NeuralCDE/tree/master
- https://github.com/patrick-kidger/torchcde
- https://github.com/rtqichen/torchdiffeq
- https://github.com/patrick-kidger/diffrax (the author said that this is more powerful)
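A minimal sketch of a neural CDE forward pass using torchcde (second link above); the channel sizes, the tanh in the vector field, and the random data are illustrative assumptions, not a definitive implementation:

```python
import torch
import torchcde

class CDEFunc(torch.nn.Module):
    """Vector field f_theta: maps the hidden state z to a matrix that
    is integrated against dX/dt inside the CDE solver."""
    def __init__(self, input_channels, hidden_channels):
        super().__init__()
        self.input_channels = input_channels
        self.hidden_channels = hidden_channels
        self.linear = torch.nn.Linear(hidden_channels,
                                      hidden_channels * input_channels)

    def forward(self, t, z):
        return self.linear(z).tanh().view(
            z.shape[0], self.hidden_channels, self.input_channels)

# Placeholder data: batch of 32 series, 20 observations, 3 channels
# (in practice, time is usually included as one of the channels).
x = torch.randn(32, 20, 3)
coeffs = torchcde.hermite_cubic_coefficients_with_backward_differences(x)
X = torchcde.CubicSpline(coeffs)          # continuous path built from data

func = CDEFunc(input_channels=3, hidden_channels=8)
z0 = torch.randn(32, 8)                   # initial hidden state
zT = torchcde.cdeint(X=X, func=func, z0=z0, t=X.interval)  # solve the CDE
print(zT.shape)  # (32, 2, 8): hidden state at both ends of the interval
```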