SciML / ModelingToolkit.jl

An acausal modeling framework for automatically parallelized scientific machine learning (SciML) in Julia. A computer algebra system providing integrated symbolics for physics-informed machine learning and automated transformations of differential equations.
https://mtk.sciml.ai/dev/

NetworkSystem #341

Closed ChrisRackauckas closed 8 months ago

ChrisRackauckas commented 4 years ago

It would be nice to have the following work on any AbstractSystem. You supply a network given by a LightGraphs AbstractGraph, you provide a prototype component, and you have a syntax for unidirectional edges. Here's an example. Let g be the graph and suppose we've already defined a component c. Then we might want to construct:

c = ODESystem(...)
connection(cfrom,cto) = [cfrom.x + cto.x ~ 0,cfrom.y - cto.y ~ 0]
ODESystem(connection,g,c)

which clones c into copies c_i, one for each node, and for every edge adds the corresponding connection equations. For more complex examples, we could have

ODESystem([(connection1,g1),(connection2,g2)],c)

or something like that, so you can have different graphs specifying different edges and different equations associated with those edges. The constraint would be that the nodes would have to be the same across all of the graphs, since you're only changing the associated edges.
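
To make that concrete, here is a rough sketch of what such a constructor could do internally. This is not an existing MTK API; the helper name `network_odesystem`, the `c_i` renaming scheme, and the use of `equations(prototype)` to clone the component are assumptions for illustration.

```julia
using ModelingToolkit, LightGraphs

# Hypothetical sketch: clone a prototype component onto every vertex of `g`
# and generate connection equations for every (directed) edge.
function network_odesystem(connection, g::LightGraphs.AbstractGraph,
                           prototype::ODESystem, iv)
    # one renamed copy of the prototype per vertex: c_1, c_2, ...
    components = [ODESystem(equations(prototype), iv; name = Symbol(:c_, i))
                  for i in vertices(g)]
    # one set of connection equations per edge
    conns = Equation[]
    for e in edges(g)
        append!(conns, connection(components[src(e)], components[dst(e)]))
    end
    # the top level holds only the connection equations;
    # the cloned components enter as namespaced subsystems
    return ODESystem(conns, iv; name = :network, systems = components)
end
```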

I think this could cover neuron simulations, building RDME reaction networks, graph-based epidemiological models, and power systems models. Thoughts, @isaacsas @asinghvi17?

mforets commented 4 years ago

you may be interested in https://github.com/blegat/HybridSystems.jl which implements hybrid automata.

ChrisRackauckas commented 4 years ago

I'm not sure that's related at all. I plan to add RegimeSwitchingSystems, but that's a fairly different problem from network dynamics.

mforets commented 4 years ago

Oops, sorry... I thought this was about an API for hybrid systems.

ChrisRackauckas commented 4 years ago

Maybe I should clarify a bit. It's for network dynamics where you have for example Lotka-Volterra on a network, so you have the Lotka-Volterra equations for x_i and y_i, and then you have terms in the differential equations that connect ODE sets i and j.
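
(For concreteness, a hand-written two-node version of this kind of system might look roughly like the sketch below; the diffusive coupling term with strength k is purely illustrative.)

```julia
using ModelingToolkit

# Illustrative only: Lotka-Volterra on a two-node graph, written out by hand.
# The coupling term k*(x_j - x_i) is a stand-in for whatever the edge
# relationship actually is.
@parameters t a b c d k
@variables x1(t) y1(t) x2(t) y2(t)
@derivatives Dt'~t

eqs = [
    # node 1: Lotka-Volterra plus a coupling term from node 2
    Dt(x1) ~ a*x1 - b*x1*y1 + k*(x2 - x1),
    Dt(y1) ~ -c*y1 + d*x1*y1,
    # node 2: the same equations, coupled back to node 1
    Dt(x2) ~ a*x2 - b*x2*y2 + k*(x1 - x2),
    Dt(y2) ~ -c*y2 + d*x2*y2,
]

lv_network = ODESystem(eqs)
```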

asinghvi17 commented 4 years ago

Let me rerun some simulations - as I recall, I kind of dropped this effort because (a) the DAE solve was quite a bit slower than the handwritten ODE solve, and (b) there were some differences between the DAE and ODE solutions. If the DAE solution is accurate, then this approach makes sense to me; otherwise, I was going to try to build up the graph "by hand", by modifying the equations before creating a system.

ChrisRackauckas commented 4 years ago

The DAE should be just as accurate; if not, there's an error in the model form. DAE solves are as fast as stiff ODE solves (in mass matrix form, since the same methods are used), but if it's a non-stiff ODE then you need https://github.com/SciML/ModelingToolkit.jl/issues/340 in order to convert back down to an ODE for explicit methods.

asinghvi17 commented 4 years ago

Also, I think we should change the required arguments of connection to the following:

connection(
    sys_from, 
    sys_to, 
    edge_weight, # can be any arbitrary metadata, even a function
)::Vector{Equation}

This should take care of weighted connections, which are particularly necessary in neuron simulation. It occurs to me that we could have two constructors: a high-level one which constructs the graph with systems for you, and a lower-level one where you provide the system for each vertex. (We can't carry this information around in the graph, so we may as well carry it separately in a vector indexed by vertex number.) This would allow anisotropic graphs, e.g. to selectively introduce defects into particular systems.
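
A coupling function matching that signature could look something like the sketch below; the variable names F and v are assumptions (a forcing/flux variable and a state on each component), not part of any existing interface.

```julia
using ModelingToolkit

# Sketch of connections with the proposed (sys_from, sys_to, edge_weight)
# signature. F and v are hypothetical per-component variables.
connection(sys_from, sys_to, edge_weight::Number) =
    [0 ~ sys_to.F + edge_weight * sys_from.v]

# Since the third slot is arbitrary metadata, it can also be a function
# that builds the coupling term itself:
connection(sys_from, sys_to, edge_coupling::Function) =
    [0 ~ sys_to.F + edge_coupling(sys_from.v, sys_to.v)]
```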

asinghvi17 commented 4 years ago

ODESystem([(connection1,g1),(connection2,g2)],c)

How would we unify the graphs? As far as I'm aware, there's no mechanism for graph fusion in LightGraphs. I imagine that a MetaDiGraph with connection functions or some user dispatch mechanisms on edges would be the best way forward for this.

ChrisRackauckas commented 4 years ago

I don't think we'd unify graphs. But we'd somehow need a canonical numbering system for the nodes.

asinghvi17 commented 4 years ago

As I understand it, LightGraphs handles the node numbering. For example, the constructor for a directed, weighted graph is:

SimpleWeightedDiGraph(12)

whose vertices are then 1:12. As I understand it, this is the canonical system for all LightGraphs graphs.

ChrisRackauckas commented 4 years ago

And if we have a different graph with the same number of nodes but different edges, how can we know which nodes are the same?

asinghvi17 commented 4 years ago

Wouldn't that just be a degenerate graph, then? I'm not sure if we need to know which nodes are the same, since presumably if you pass two graphs with two coupling mechanisms, you want all of the coupling specified...unless I'm misinterpreting the purpose of this.

ChrisRackauckas commented 4 years ago

Well, I want to say "this is node 10": in graph 1, node 10 has an edge to node 11, so add these equations; in graph 2 it does not, so don't add them.

asinghvi17 commented 4 years ago

That's just the vertex number, no? Here's some example code which might clarify this.

using LightGraphs, SimpleWeightedGraphs
graph = SimpleWeightedDiGraph(2)
e1 = SimpleWeightedEdge(1, 2, 1)
e2 = SimpleWeightedEdge(2, 1, 4)
add_edge!(graph, e1)
add_edge!(graph, e2)

looks like this:

[screenshot of the resulting two-vertex weighted digraph]

ChrisRackauckas commented 4 years ago

Oh, I didn't know there was a global index. I thought all indexing was node-local because the edge lists are vectors of vectors. Okay, then I think we're good.

asinghvi17 commented 4 years ago

Here's some working example code, which only works for one edge per node. The system is composed of two independent, mutually coupled neuron pairs.

using LightGraphs, SimpleWeightedGraphs, ModelingToolkit, DifferentialEquations, Plots
using LightGraphs: src, dst, weights
using SimpleWeightedGraphs: SimpleWeightedDiGraph, SimpleWeightedEdge

# define the connectome graph

tng = SimpleWeightedDiGraph(4)

Dinhibs = [0.1, 0.1]

# Add mutual inhibition
for (ventral, dorsal, Dinhib) in zip(1:2, 3:4, Dinhibs)
    e1 = SimpleWeightedEdge(ventral, dorsal, Dinhib)
    e2 = SimpleWeightedEdge(dorsal, ventral, Dinhib)
    LightGraphs.add_edge!(tng, e1)
    LightGraphs.add_edge!(tng, e2)
end

# define the unit equation

@parameters t g e b

@variables v(t) w(t) F(t)

@derivatives D'~t

f(v) = v - v^3/3

fhn = [
    D(v) ~ f(v) - w + F,
    D(w) ~ e * (v - g * w + b)
]

function couple(sys_from, sys_to, weight)
    return [0 ~ sys_from.F + weight * sys_to.v]
end

# generate a single equation spec from the graph

# first, create all relevant systems
systems = [
    ODESystem(
        fhn, t, [v,w,F], [g,e,b];
        name = Symbol(
                "n" * string(i) # ModelingToolkit.map_subscripts(string(i))
        )
    )
    for i in vertices(tng)
]

# populate couplings from edges

couplings = Equation[]

for edge in edges(tng)
    append!(
        couplings,
        couple(
            systems[src(edge)],
            systems[dst(edge)],
            weight(edge)
        )
    )
end

connected = ODESystem(
    couplings,
    t,
    [],
    [];
    systems = systems
)

# Construct the initial condition, special-casing the first
# 2 neurons which comprise the head oscillator

u0 = [
    systems[1].v => -1.0,
    systems[1].w => -0.51,
    systems[1].F => -0.01,
    systems[2].v =>  1.0,
    systems[2].w => -0.49,
    systems[2].F => 0.01,
]

for sys in systems[3:end]
    append!(u0,
        [
        sys.v =>  1.0,
        sys.w => -0.49,
        sys.F => 0.01,
        ]
    )
end

# Construct the parameters, which are the same across
# all systems in this case

p0 = Pair{Operation, Float64}[]

for sys in systems
    append!(
        p0,
        [
            sys.g => 0.8,
            sys.b => 0.46,
            sys.e => 0.04,
        ]
    )
end

# Construct the ODEProblem

prob = ODEProblem(connected, u0, (0.0, 2000.0), p0)
sol = solve(prob, Rodas5())

plot(sol; vars = collect(1:3:12))

[plot of the solution]

asinghvi17 commented 4 years ago

OK, this should now work for arbitrary simple graphs! We have to change the signature of the connection function to couple(system_to, systems_from, weights), though, since we can only have a single constraint equation per flux term.

using LightGraphs, SimpleWeightedGraphs, ModelingToolkit, DifferentialEquations, Plots, SparseArrays
using LightGraphs: src, dst, weights
using SimpleWeightedGraphs: SimpleWeightedDiGraph, SimpleWeightedEdge

# define the connectome graph

tng = SimpleWeightedDiGraph(4)

Dinhibs = [0.1, 0.1]
Dgaps = fill(-0.05, 1)

# Add mutual inhibition
for (ventral, dorsal, Dinhib) in zip(1:2, 3:4, Dinhibs)
    e1 = SimpleWeightedEdge(ventral, dorsal, Dinhib)
    e2 = SimpleWeightedEdge(dorsal, ventral, Dinhib)
    LightGraphs.add_edge!(tng, e1)
    LightGraphs.add_edge!(tng, e2)
end

# We handle gaps separately, since the head oscillator 
# does not have a forcing neuron pair upstream
for (ventral, dorsal, Dgap) in zip(2:2, 4:4, Dgaps)
    e1 = SimpleWeightedEdge(ventral - 1, ventral, Dgap)
    e2 = SimpleWeightedEdge(dorsal - 1, dorsal, Dgap)
    LightGraphs.add_edge!(tng, e1)
    LightGraphs.add_edge!(tng, e2)
end

This creates a system which looks like the following: [diagram of the resulting four-neuron network]

# define the unit equation

@parameters t g e b

@variables v(t) w(t) F(t)

@derivatives D'~t

f(v) = v - v^3/3

fhn = [
    D(v) ~ f(v) - w + F,
    D(w) ~ e * (v - g * w + b)
]

function couple(system_to, systems_from, weights)
    return [0 ~ system_to.F +  sum(weights .* getproperty.(systems_from, :v))]
end

# generate a single equation spec from the graph

# first, create all relevant systems
systems = [
    ODESystem(
        fhn, t, [v,w,F], [g,e,b];
        name = Symbol(
                "n" * string(i) # ModelingToolkit.map_subscripts(string(i))
        )
    )
    for i in vertices(tng)
]

# populate couplings from edges

couplings = Equation[]

graph_weights = SparseArrays.SparseMatrixCSC(weights(tng))
for vertex in vertices(tng)
    # collect the incoming neighbors and their edge weights; the local array
    # is named edge_weights to avoid shadowing LightGraphs.weights
    edge_weights = Float64[]
    systems_from = ODESystem[]
    for neighbor in inneighbors(tng, vertex)
        push!(systems_from, systems[neighbor])
        push!(edge_weights, graph_weights[neighbor, vertex])
    end
    append!(
        couplings,
        couple(
            systems[vertex],
            systems_from,
            edge_weights
        )
    )
end

connected = ODESystem(
    couplings,
    t,
    [],
    [];
    systems = systems
)

# Construct the initial condition, special-casing the first
# 2 neurons which comprise the head oscillator

u0 = [
    systems[1].v => -1.0,
    systems[1].w => -0.51,
    systems[1].F => -0.01,
    systems[2].v =>  1.0,
    systems[2].w => -0.49,
    systems[2].F => 0.01,
]

for sys in systems[3:end]
    append!(u0,
        [
        sys.v =>  1.0,
        sys.w => -0.49,
        sys.F => 0.01,
        ]
    )
end

# Construct the parameters, which are the same across
# all systems in this case

p0 = Pair{Operation, Float64}[]

for sys in systems
    append!(
        p0,
        [
            sys.g => 0.8,
            sys.b => 0.46,
            sys.e => 0.04,
        ]
    )
end

# Construct the ODEProblem

prob = ODEProblem(connected, u0, (0.0, 2000.0), p0)
sol = solve(prob, Rodas5())

plot(sol; vars = collect(1:3:12))

[plot of the solution]

Function version which will go in a PR:

```julia
function construct_system(systems::Vector{ODESystem}, graph::LightGraphs.AbstractGraph, connector::Function)
    verts = vertices(graph)
    @assert length(systems) == length(verts)
    couplings = Equation[]
    graph_weights = SparseArrays.SparseMatrixCSC(weights(graph))
    # Loop over each vertex and collect its incoming edges
    for vertex in verts
        edge_weights = Float64[]   # named to avoid shadowing LightGraphs.weights
        systems_from = ODESystem[]
        for neighbor in inneighbors(graph, vertex)
            push!(systems_from, systems[neighbor])
            push!(edge_weights, graph_weights[neighbor, vertex])
        end
        append!(
            couplings,
            connector(
                systems[vertex],
                systems_from,
                edge_weights
            )
        )
    end
    return ODESystem(
        couplings,
        t,
        [],
        [];
        systems = systems
    )
end

function construct_system(eqs::Vector{Equation}, graph::LightGraphs.AbstractGraph, connector::Function)
    return construct_system(
        [ODESystem(eqs) for _ in vertices(graph)],
        graph,
        connector
    )
end
```
isaacsas commented 4 years ago

Using a graph to represent spatial connectivity would work for RDME, though we should add some spatial SSAs that then exploit this structure (this has been on my todo for a while).

lindnemi commented 4 years ago

Actually this is pretty similar to what we are doing at NetworkDynamics.jl. Here is an example with FitzHugh-Nagumo oscillators on a brain atlas topology.

The question above regarding networked systems is actually what I had in mind when asking about https://github.com/SciML/ModelingToolkit.jl/issues/315.

FHell commented 4 years ago

To follow up on @lindnemi's comment: we are also very much interested in integrating NetworkDynamics.jl with ModelingToolkit, and we have a couple of students starting to work in this direction in the next weeks. I will write a new issue that summarizes our current thinking about this, unless there is a better place for such discussions?

ChrisRackauckas commented 4 years ago

@FHell an issue would be great. Indeed, NetworkDynamics.jl is absolutely fantastic and I don't want to "cannibalize your efforts" in any way; instead I'm trying to find a way to provide a symbolic-numeric target that better supports what you're doing (since there's a lot that can be done symbolically here!). I'd like to hear your thoughts before we continue further down this route, since you have more experience in this area.

asinghvi17 commented 4 years ago

I've been working with this a little more, and it seems to me that the best way to do this is to accept, at the lowest level, an adjacency matrix. This way, we don't have to worry about any special graph structures, and we have a robust fallback for weird graph types.

We could, then, also specialize (or allow users to specialize) the constructor on their graph types if there's a faster way of going about it.
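
As a sketch of that lowest level, the coupling loop from the prototype above only really needs a weight/adjacency matrix; `couple` and `systems` are reused from the earlier example, and the helper name is made up.

```julia
using ModelingToolkit, SparseArrays

# Sketch: build coupling equations directly from a weight matrix W, where
# W[i, j] != 0 means "vertex i feeds vertex j with weight W[i, j]".
# `systems` and `couple` are as in the prototype above.
function couplings_from_matrix(W::AbstractMatrix, systems, couple)
    couplings = Equation[]
    for j in axes(W, 2)                    # destination vertex
        srcs = findall(!iszero, W[:, j])   # incoming source vertices
        # vertices with no incoming edges would need their own closing
        # equation (e.g. 0 ~ F); skipped here for brevity
        isempty(srcs) && continue
        append!(couplings, couple(systems[j], systems[srcs], W[srcs, j]))
    end
    return couplings
end
```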

ChrisRackauckas commented 4 years ago

Yeah, that makes sense.

FHell commented 4 years ago

Some ideas from ND.jl and the use case of power dynamics:

For power grid models it's crucial to allow different behaviour on different nodes and edges. What we do in NetworkDynamics.jl is to allow the user to not just pass in a single component that gets cloned, but an array of components, one for each vertex. We also allow an array of edge relationships (e.g. most edges are copper lines, some are more sophisticated controllable transformers). We are looking at whether multilayer modelling will allow us to keep things homogeneous per layer.

We also restricted ourselves to unweighted graphs. Instead we have a system that allows one to pass parameters to each vertex and edge separately. If we want to tune the strength of a connection after defining the dynamics it's much more convenient to use the DEFunction parameter for that. I think that approach would work here, too?

The design in NetworkDynamics is based on what you, I believe, call causal modelling (since we started to work with control people a lot, I think of it as input/output; I sketched the design in the other issue). We first calculate the edges into a buffer, then we calculate the vertices as functions of the edges. (If the edges are specified by ODEs as well, as for more sophisticated power grid models, we don't need the buffer.) This means our dependency graph is dead trivial: we get a perfectly independent loop over edges, followed by a perfectly independent loop over vertices, so the core function is trivially parallelizable by design. How does MTK make use of the parallelism?
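
(For reference, the two-pass structure just described amounts to something like the schematic below; this is not the actual NetworkDynamics.jl code, and the `u_of`/`du_of` state-slicing helpers are placeholders.)

```julia
using LightGraphs

# Schematic of the buffered edge/vertex design described above; both loops are
# independent in their iterations and therefore trivially parallelizable.
function network_rhs!(du, u, p, t, graph, edge_f!, vertex_f!, edge_buf, u_of, du_of)
    # pass 1: each edge value depends only on its endpoint states
    for (k, e) in enumerate(edges(graph))
        edge_f!(view(edge_buf, :, k), u_of(u, src(e)), u_of(u, dst(e)), p, t)
    end
    # pass 2: each vertex depends only on its own state and its incident edges
    for v in vertices(graph)
        incident = [k for (k, e) in enumerate(edges(graph)) if dst(e) == v]
        vertex_f!(du_of(du, v), u_of(u, v), view(edge_buf, :, incident), p, t)
    end
    return nothing
end
```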

We are starting to look more into control applications. I am not sure all the aggregations (in your case the forcing sum) are best expressed as equalities. Control logic is often more naturally code rather than equalities. We were planning to have a non-MTK function that does the right thing in the loop. How arbitrary can functions of MTK variables get at the moment?

I am really amazed at the performance of the system builder for large equations. I remember having real issues with it very early on when I prototyped some networked systems in MTK (maybe a year ago?). Are you looking into how this scales for very large systems (also in memory use)?

My feeling is that we might want to transition to MTK networks eventually, but in the short term we probably want to take a more gradual approach, using it strategically in some places.

ChrisRackauckas commented 4 years ago

If we want to tune the strength of a connection after defining the dynamics it's much more convenient to use the DEFunction parameter for that. I think that approach would work here, too?

Not sure what you're mentioning here.

We also allow an array of edge relationships (e.g. most edges are copper lines, some are more sophisticated controllable transformers). We are looking at whether multilayer modelling will allow us to keep things homogeneous per layer.

That's what I plan to do with different graphs: give two graphs, each doing something slightly different, instead of one graph with different edge types. I think the two are equivalent, though.

How does MTK make use of the parallelism?

Many ways. Threading, distributed, and now we're starting to write schedulers to split the graph more effectively. I plan to train a neural network cost model so we auto-split in a way that minimizes the overhead of over-subscribing one thread. We'd cut components down at compile time so that if they are unbalanced we can rebalance the threads. You can find a bunch of other issues open on these topics, and my JuliaCon talk is actually going to be about that part of MTK.

I am really amazed at the performance of the system builder for large equations. I remember having real issues with the very early on when I protoyped some networked systems in MTK (maybe a year ago?). Are you looking into how this scales for very large systems (also in memory use)?

We test quite frequently on million-equation systems. There are funny things we hit in the Julia compiler, but we've been working around them or getting them fixed (the most curious being that splitting up + matters a lot when you have 100 terms in a line 😆). MTK is also going to be the test bed for SymbolicUtils performance (https://github.com/JuliaSymbolics/SymbolicUtils.jl/issues/61, https://github.com/JuliaSymbolics/SymbolicUtils.jl/issues/62), so file an issue if you hit anything.

lindnemi commented 4 years ago

If we want to tune the strength of a connection after defining the dynamics it's much more convenient to use the DEFunction parameter for that. I think that approach would work here, too?

Not sure what you're mentioning here.

The edge weights shouldn't be hardcoded in the graph structure, but rather expressed as system parameters, i.e. the p in the f(dx, x, p, t) syntax. That's especially useful when the behaviour of an edge/connection depends on more than one parameter, or when you want to perform an optimization over them.
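
In terms of the MTK prototype above, that would amount to something like the following sketch: the graph only decides which coupling terms exist, while their strengths are symbolic parameters. The names w_21 and w_12 are made up, and the new parameters would also have to be declared on the connected system.

```julia
using ModelingToolkit

# Sketch: symbolic edge-weight parameters instead of Floats from the graph.
@parameters t w_21 w_12

# reuse `couple` and `systems` from the prototype above, but pass parameters
# as the weights:
couplings = vcat(
    couple(systems[1], [systems[2]], [w_21]),   # 0 ~ n1.F + w_21 * n2.v
    couple(systems[2], [systems[1]], [w_12]),   # 0 ~ n2.F + w_12 * n1.v
)

# the weights then live in the parameter map like any other parameter and can
# be retuned or optimized without rebuilding the system:
append!(p0, [w_21 => 0.1, w_12 => 0.1])
```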

asinghvi17 commented 4 years ago

Offering both approaches through using MTK parameters as edge weights would be trivial to do. Edit: Ah, I see what you're saying now. There are multiple ways to achieve this - you could leave the weights out of the graph and introduce them as parameters in the coupling function instead.

FHell commented 4 years ago

Offering both approaches through using MTK parameters as edge weights would be trivial to do. Edit: Ah, I see what you're saying now. There are multiple ways to achieve this - you could leave the weights out of the graph and introduce them as parameters in the coupling function instead.

Let's make this concrete with a power grid case; this is something we are looking to do. To parametrize the system, you draw the current demand at each of the 10,000 vertices from a random distribution. You also have different node types, so every node has a different number of parameters (but they all include the power infeed P). Now you want to simulate what happens when 10 out of 40,000 edges fail; you want to do that by randomly selecting 10 edges and setting their weight to 0. You want to run this experiment 10,000 times.

So to do that, it would be amazing if I could just manipulate parameters like this:

p.P = background .+ randn(n_vertices)

e_select = rand(1:n_edges, n_failures)
p.admittance[e_select] .= 0.0

Is this what you had in mind as well?

I believe if you build the NetworkSystem to play nice with ArrayPartitions this should be possible. The question is, if you have inhomogeneous vertices where parameters only occur at some nodes but not at others, what do you do then?

(In ND.jl we actually often have callable structs for the vertices that carry around their parameters, and then manipulate parameters directly inside the rhs, less than ideal user experience).

Edit: For further fun: 200 of the nodes have control capabilities parametrized by a set of control parameters, and we want to tune these control capabilities by optimizing them. So we need a wrapper that takes in an array and then distributes the values to these 200 special nodes.

asinghvi17 commented 4 years ago

That should be easy enough (with a couple of caveats). Currently, parameters as arrays don't work so well in MTK (as far as I know) but you should be able to achieve this kind of result with the prototype we have up now.

ChrisRackauckas commented 4 years ago

Currently, parameters as arrays don't work so well in MTK (as far as I know)

I assume you mean ArrayPartition? Yeah, the generated code assumes arrays. MTK wants to flatten the memory layout so it can control it for auto-parallelism. We need better tools to allow the transformation, since in the end it's better off being declarative here. The reason is:

(In ND.jl we actually often have callable structs for the vertices that carry around their parameters, and then manipulate parameters directly inside the rhs, less than ideal user experience).

Exactly this. In an imperative language, you mix abstraction with memory layout and thus performance. The abstraction of "what things go together" might not necessarily match the best way to split computations for parallelism and DAE index reduction. So what we're trying to do here is internally have everything by index, associate it by name, take values from you, and have you cede control of the memory layout so that we can choose our own (which can be a requirement for parallelism). There are ways around it, like making Variable{Matrix}, but those just clump things together in a way that has to be sane.

So I think the way to make NetworkSystem play nice with ArrayPartition is to do that outside MTK and have it trace to something for the NetworkSystem, which would lower to code that has all of the right naming and referencing, but is not tied to the memory layout there. That would fix performance on more than a few projects I have found a way into, where the abstraction didn't match how it should be computed.

ChrisRackauckas commented 3 years ago

The design of this should now start to get reconsidered given the acausal modeling. For example, with:

https://mtk.sciml.ai/dev/tutorials/acausal_components/

let's say you make a bunch of circuit components. It would be really nice to be able to specify a graph of connections for them. So I think we're ready for a high-level API for this.
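
A high-level entry point along those lines might look roughly like the sketch below. The `component_for_vertex` constructor, the `.p`/`.n` pin names, and the use of `connect` follow the linked acausal tutorial, but this helper itself is hypothetical.

```julia
using ModelingToolkit, LightGraphs

# Hypothetical: place one acausal component per vertex and connect pins along
# every edge of the graph, modeled on the acausal components tutorial.
function circuit_from_graph(g::LightGraphs.AbstractGraph, component_for_vertex, t)
    comps = [component_for_vertex(i) for i in vertices(g)]
    # for each edge, connect the source component's positive pin to the
    # destination component's negative pin
    conn_eqs = [connect(comps[src(e)].p, comps[dst(e)].n) for e in edges(g)]
    return ODESystem(conn_eqs, t; name = :circuit, systems = comps)
end
```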

hexaeder commented 3 years ago

Reading this thread I've noticed that there are two different ways to think about network systems, and I'm not sure whether everyone is on the same page here:

  1. Each node is represented by a system; each edge represents an identity relation between variables of the connected nodes.
  2. Each node as well as each edge is represented by a system. In this case the edges become somewhat stateful. This is how we would model power systems, diffusion networks, ...

I'll refer to the second one in this post. Obviously version 2 can be transformed into version 1 by substituting each "stateful" edge with a new node between the connected nodes.
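
To make the distinction concrete, in version 2 an edge could itself be an ODESystem with its own state, along the lines of the sketch below (the names and the relaxation equation are illustrative only):

```julia
using ModelingToolkit

# Illustrative "version 2" edge: the edge carries a state i_edge that relaxes
# toward a conductance times the voltage difference of the connected nodes.
# Connecting it into the network would mean equating v_src/v_dst with the node
# voltages and feeding i_edge back into the nodes' current balances.
@parameters t τ g_edge
@variables i_edge(t) v_src(t) v_dst(t)
@derivatives Dt'~t

edge_sys = ODESystem(
    [Dt(i_edge) ~ (g_edge * (v_src - v_dst) - i_edge) / τ],
    t;
    name = :edge
)
```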

I am from the NetworkDynamics.jl group as well. Currently we're working on an MTK-network interface. We have a small working example for this here. This example works with our own wrappers around ODESystems, but it works quite similarly to the acausal modeling.

There need to be three main components:

The last one is quite important because it specifies which kinds of connections between the different subsystems need to be defined. It should be possible to have heterogeneous node systems and edge systems as long as every subsystem fulfills those definitions. For example, for power grids it does not matter how your node dynamics work as long as they calculate a voltage. Vice versa, it does not matter how the edge dynamics work as long as they calculate a current based on the voltages of the connected nodes.

One important point is the edge aggregation: each node might have a different number of connected edges. There are two ways around this:

Another major point is the direction of the edges. This topic actually causes a lot of headaches. I'd say the most basic and most important way of modeling is by using directed graphs. Each edge system takes two variables, source_vertex and destination_vertex, and calculates the edge_state based on those. Each node should only consider the incoming edges as connected edges.

In the real world those edge functions often have some kind of symmetry/antisymmetry. It might be useful to implement higher-level interfaces on top of those purely directed graphs for undirected graphs with symmetry properties. But I think it is important to get the basic interface right first.

ChrisRackauckas commented 3 years ago

introduce some kind of aggregator function/system, which takes all of the surrounding edge variables and sums them up for example (i.e. the node will only see the sum of all current on the connected edges)

That's what https://github.com/SciML/ModelingToolkit.jl/issues/814 is for.

ChrisRackauckas commented 3 years ago

In the real world those edge functions often have some kind of symmetry/antisymmetry. It might be useful to implement higher-level interfaces on top of those purely directed graphs for undirected graphs with symmetry properties. But I think it is important to get the basic interface right first.

Yes, definitely agreed.

ChrisRackauckas commented 8 months ago

It seems there are downstream extensions which do this kind of thing now.