SciML / DiffEqFlux.jl

Pre-built implicit layer architectures with O(1) backprop, GPUs, and stiff+non-stiff DE solvers, demonstrating scientific machine learning (SciML) and physics-informed machine learning methods
https://docs.sciml.ai/DiffEqFlux/stable
MIT License

Implementing Neural DAE's #124

Closed: ali-ramadhan closed this issue 4 years ago

ali-ramadhan commented 4 years ago

Looking through neural_de.jl, it doesn't look like it'll be hard to implement a NeuralDAE.

I'm interested in using them so I'll try to add a NeuralDAE type, test it out, and open a PR.

They're also mentioned in arXiv:2001.04385 [cs.LG], so it would be good to add them.

ChrisRackauckas commented 4 years ago

It's not that it's hard, it's just hard to know what the right interface is. I don't think there's ever a reason for a flat-out neural DAE. However, a universal DAE makes sense for imposing physical constraints, like constant energy. The issue is that I'm not sure that can be packaged up more than telling someone to use concrete_solve. As a flat-out layer, I'm not sure a neural DAE achieves that much.

ali-ramadhan commented 4 years ago

I see.

However, a universal DAE makes sense for imposing physical constraints, like constant energy.

Yeah, conservation laws was exactly my use case so I think I want to implement a universal DAE instead of a neural DAE.

The issue is that I'm not sure that can be packaged up more than telling someone to use concrete_solve.

Hmmm, so maybe I should just get a simple example working with concrete_solve. But even having a thin wrapper that calls concrete_solve might be nice to make the user API consistent.

ChrisRackauckas commented 4 years ago

Maybe an interesting API would be you specify physical constraints, and it defines a neural network for the DAE portion on the other equations. I.e. if you have 5 state variables and specify 2 conservation laws, it builds a 5->3 NN and appends the 2 conservation laws as the DAE system, which will then find a dynamical system that obeys your conservation. We can make this in mass-matrix ODE and in fully implicit form.
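The proposed formulation can be illustrated numerically. Below is a minimal NumPy sketch (a hypothetical stand-in, not DiffEqFlux code) of the smallest case: two states and one conservation law, so a model drives one equation while the constraint fills out the other, giving the mass-matrix DAE M u' = F(u) with singular M. The decay term and the constraint u1 + u2 = 1 are assumed examples standing in for the NN output and a conserved energy.

```python
import numpy as np

def F(u):
    f = -u[0]                  # differential part (would be the NN's output)
    g = u[0] + u[1] - 1.0      # algebraic constraint residual (conservation law)
    return np.array([f, g])

M = np.diag([1.0, 0.0])        # singular mass matrix: second row is algebraic

def implicit_euler(u0, h, nsteps):
    """Integrate M u' = F(u) with implicit Euler, Newton on each step."""
    u = u0.copy()
    for _ in range(nsteps):
        v = u.copy()
        for _ in range(20):    # Newton iterations on G(v) = M(v - u) - h F(v)
            G = M @ (v - u) - h * F(v)
            # finite-difference Jacobian of G
            J = np.zeros((2, 2))
            eps = 1e-8
            for j in range(2):
                dv = v.copy()
                dv[j] += eps
                J[:, j] = (M @ (dv - u) - h * F(dv) - G) / eps
            step = np.linalg.solve(J, -G)
            v += step
            if np.linalg.norm(step) < 1e-12:
                break
        u = v
    return u

u0 = np.array([1.0, 0.0])
uT = implicit_euler(u0, h=0.01, nsteps=100)  # integrate to t = 1
print(uT, uT.sum())  # u1 decays toward exp(-1); u1 + u2 stays at 1
```

The point of the sketch is the shape of the system: zero rows in the mass matrix turn those equations into constraints the solver enforces at every step, which is exactly what imposing a conservation law on the learned dynamics requires.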

ChrisRackauckas commented 4 years ago

For those looking to work on this, set the default sensealg to TrackerAdjoint() until our sensealgs are singular mass matrix compatible, which is currently WIP.

ali-ramadhan commented 4 years ago

So I ended up reformulating my DAE problem as an ODE problem, which let me enforce energy conservation via a simple Flux.jl layer and keep using NeuralODE. Energy is conserved now, which is nice.
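The original comment doesn't show the layer, but one way such a reformulation can work is sketched below in plain NumPy (a hypothetical illustration, not the actual Flux.jl code): if E(u) is the conserved quantity, a layer that subtracts the component of the raw network output along grad E guarantees dE/dt = grad E · f = 0 identically, so the system stays an ordinary ODE. The quadratic E(u) = ½‖u‖² is an assumed example.

```python
import numpy as np

def grad_E(u):
    return u                        # gradient of E(u) = 0.5 * sum(u**2)

def conserving_layer(f_raw, u):
    """Project the raw vector field onto the tangent space of E's level set."""
    g = grad_E(u)
    return f_raw - (f_raw @ g) / (g @ g) * g

u = np.array([1.0, 2.0])
f = np.array([0.3, -0.7])           # stand-in for a neural network's output
fp = conserving_layer(f, u)
print(fp @ grad_E(u))               # ≈ 0.0: the corrected field conserves E
```

Because the projection is differentiable, it composes with the network like any other layer during training.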

It seems that time-stepping true DAEs that cannot be reformulated requires something like the IDA method from Sundials. Reformulating as an ODE problem with a mass matrix might be an option, but I haven't played around to see if I can train a neural differential equation with a mass matrix.

ChrisRackauckas commented 4 years ago

It seems that time-stepping true DAEs that cannot be reformulated requires something like the IDA method from Sundials. Reformulating as an ODE problem with a mass matrix might be an option, but I haven't played around to see if I can train a neural differential equation with a mass matrix.

You can with a mass matrix and TrackerAdjoint. Full DAE adjoints is something @YingboMa is working on over the next month, so IDA can work. There is an undocumented pure Julia version though being released soon. Our DAE story is really coming together in the next few months.

ChrisRackauckas commented 4 years ago

Added with https://github.com/JuliaDiffEq/DiffEqFlux.jl/pull/159