Discussion: Should models be in derivative form?

caglorithm commented 4 years ago

Hey peoples, I've been thinking about this, specifically in the context of @jajcayn's recent findings that the thalamic model is really sensitive to dt in the Euler integration scheme. It would be great if we could just run models in other schemes as well, to be more flexible. But the current "philosophy" is to have an integrator for each model, i.e. every model deals with it for itself.

Implementing each model as in an dx/dt = f(x) kind of form would allow us to build different integrators around it. Obviously this means a lot of work: who integrates the noise? How do we handle delays?

This would be a big effort but maybe it's the safest way to go... thoughts?

caglorithm commented 4 years ago

@jajcayn, irrc didn't you actually recently lay out the aln model in dx/dt-form? :)

jajcayn commented 4 years ago

Yes I did:) Currently, I have something that I call experimental simulator and the basics are as follows:

it obeys hierarchical modelling idea within three layers: brain network is composed of N nodes and each node is composed of M neural masses, ergo the base building block is one neural mass (e.g. aln excitatory is one mass, aln inhibitory is second mass, these two masses create one aln node and N aln nodes make aln brain network)
each neural mass is defined as dx/dt equation in the symbolic form (think sympy or symengine)
each node is defined as a list of any number of neural mass and each node handles the connection between the masses (i.e. within node connectivity and within node delays)
then the brain network is defined as a list of any number of nodes and the network handles the connection between the nodes (i.e. grand-connectivity - usually SC and network delays - usually derived from length matrix from DTI)
when you want to integrate, the dx/dt equations from each mass from each node is laid out into one huge vector, hence, in the end, you have dx/dt for the whole brain network (for 80 nodes of ALN you have a vector of 80 x (7 + 6) as 80 nodes and 7 state variables for excitatory ALN and 6 state variables for inhibitory ALN mass)
I have implemented two integrators:
- jitcdde which uses Bogacki–Shampine Runge–Kutta method with adaptive dt for delayed differential equations - this package takes state vector in symbolic form and compiles it into C code which is then called by the python integrator - it's reasonably fast (slower than current neurolib implementation though) but most importantly - it uses adaptive dt which is kinda important for many neuroscientific problems
- numba integrator based on Euler scheme - here the symbolic state vector is "printed" into a string and that string is compiled into numba jitted function to the prepared template. you must specify dt here
as for the noise - this is a tricky part. What I did, in the end, is to connected "stimulus creation" and noise generation into one. So each neural mass can take so-called system_input and this system_input is provided to the integrator on the run method. You can provide zero input, hence no stimulus no noise, you can provide Wiener process (basic Gaussian noise) or Ornstein-Uhlenbeck process and on top of that, you can create sinusoidal, square, linear etc. inputs to the system. So you need to generate your noise in any form up-front and then just integrate. For jitcdde backend the noise is provided as cubic Hermite spline, so the adaptive integrator can interpolate for any dt, for numba backend the noise is provided as a simple array with correct dt spacing between the values. For the inputs, I created convenience functions, so it's a matter of one line to create a system input.

I am not saying at all this is the best we can do, but if you have dx/dt you may create heterogeneous networks (in theory each node can have a different number and types of masses) very fast since the full state vector is just concatenation of all dx/dt.

At the time being, I implemented these masses: aln mean-field, FitzHugh-Nagumo, Hopf, thalamic mass model, Wilson-Cowan and Wong-Wang - each of these is implemented as dx/dt mass dynamics

jajcayn commented 4 years ago

here is the code for the model base: gitlab and here an example of how to combine various base models into the heterogeneous net (example of one thalamic node and 1 node of ALN): custom models

for now, its hosted on TU's GitLab, in my all-purpose repository - I may create a standalone repo of out it...

caglorithm commented 4 years ago

First of all: huge respect! Am I correct insofar that the experimental simulator (it could be named Multibrain 🌈🧠, lol) could be in principle added as a new model to neurolib and the integration with jitccde is handled internally by the model?

I've completely missed the (admittedly repeatedly discussed) point that it is in fact desirable to handle everything internally because of performance reasons. I think using numba all the way (instead of having to call a numba-dx/dt repeatedly) is the reason why neurolib is so damn fast compared to other integrators.

jajcayn commented 4 years ago

yes, neurolib is incredibly fast because it uses numba but not only that - also it expects each node to be the same, so you can do two loops: one per time, one per node... If you will allow each node to be different - then you cannot do the inner loop through nodes and you'll lose some of the speed.

and yes :D, I believe this can be implemented as a so-called Multibrain model based on neurolib's Model superclass... my idea was that:

Multibrain can be used for prototyping of crazy experimental heterogeneous networks with each node being different or each node/mass having totally different parameters or if you really need adaptive dt
neurolib base models can be used when you have homogeneous network and really need the speed (because you're doing optimisation or exploration)

I can look into implementing experimental models into Multibrain model class. Btw, really like the name multibrain :)

caglorithm commented 4 years ago

It's better than calling it Heterobrain 🌈🏴‍☠️! Very stoked on seeing the first heterogeneous brain network simulation! Unfortunately, we can't claim to be the first though: TVB has nest (I think) integration now so you can run a spiking network next to a neural mass model, very cool!

jajcayn commented 4 years ago

@caglorithm how do you feel about supporting sampling dt? MultiModels support sampling dt by default (since one integrator uses adaptive dt so you need to define something), but I am using sampling dt also with numba backend: first integrate at model's dt and then just subsample... is it ok if I add support for sampling dt also to neurolib models so we keep compatibility? Especially with some models, you ought to use very small integration dt but then the results are unnecessary huge...

caglorithm commented 4 years ago

Hey, it's a good idea. But how would you implement it? Right now the model saves state_vars and outputs in dt resolution. You could subsample outputs after each run (or chunk), sure, but state_vars need to be at dt if they're used as IC's.

On a side note: It sounds pretty cool what you're working on with the MultiModel! I'd be happy if you would lay out the rough ideas how it works and the things that you've thought of, if you can take the time!

jajcayn commented 4 years ago

sampling dt: you're right, I haven't realised that... will think about it:) MultiModel ideas: will do tomorrow! :)

jajcayn commented 4 years ago

Basic ideas of the `MultiModel` approach

Hierarchical architeture

Almost all whole-brain models internally follow the three-layer hierarchical architecture. From the bottom-up view, you have neural masses (typically representing a cortical column of some hundreds of thousands of neurons). These masses are typically either excitatory (representing pyramidal neurons) or inhibitory (by default inhibitory interneurons, but of course you can represent any population).

Now, upon connecting two or more masses you get a node. Again, by default, you would connect one excitatory mass and one inhibitory mass, like in the case of ALN or Wilson-Cowan model. Hence the node is defined by their masses, their within-node connectivity and within-node delays.

Finally, at the top layer, you have a whole-brain network. The network is a collection of nodes, coupled via structural connectivity matrix (typically from DTI) and network delays (again typically from DTI).

Code design

As obvious from the above-written, the basic building block is NeuralMass class, representing one neural mass. Each mass must have defined the following properties and attributes:

_derivatives: definition of the right-hand side of the mass equation, in symbolic calculus (think sympy)
parameters: mass-specific parameters
a couple of constant attributes such as num_state_variables, mass_type, coupling_variables, num_noise_variables, index (within node), etc. Part of the neural mass equation can be system_input - this can be any stimulus or noise (basic Wiener process or Ornstein-Ohlenbeck process)

Now the Node is a collection of masses (with their indices and derivatives etc. properly defined) plus within-node connectivity matrix and within-node delays. What node internally does is it computes connectivity matrix (hence something like a "within-node connectivity * coupling variables matrix(time - delays)") and uses that as a full derivatives vector for the node.

The Network does the same, but one layer up - hence it takes a collection of nodes and connectivity and delays matrices and computes full coupling matrix and lays out full equations for the system just by gathering equations from all the nodes and all the masses inside nodes.

Since the base classes for Nodes and Networks don't really care about the particular dynamics within a node, you can mix various masses inside a node and various nodes inside a network. It just works... E.g. my latest experiment is one thalamic node (TCR excitatory mass + TRN inhibitory mass) and one AdEx node (ALN excitatory mass w. adaptation + ALN inhibitory mass) written as a 2-node network. It just works.

Integration and system input

So far I implemented two integrators: integrator with adaptive time-step for DDEs based on Shampine and Thompson scheme implemented with the help of jitcdde package and basic Euler integration implemented with numba for speed. Code-wise is integration implemented using BackendIntegrator mixin class and both Node and Network bases actually inherit from BackendIntegrator meaning you can both run and integrate nodes and network without any additional steps! In general: jitcdde backend is very reliable since it uses adaptive dt and reasonably fast (for composite networks not so much), numba on the other hand is blazing fast but you need to know the integration dt which, based on my thalamic experiments, might be tricky and for some systems, you really need to go very low (= high memory pressure and in general huge space constraints when you e.g. do exploration).

For the system input, I implemented helpers that help you with that: I created classes for creation no input (all zeroes), Wiener process (i.e. white noise), Ornstein-Uhlenbeck process, step stimulus, sinusoidal stimulus, square stimulus, and linear ramp stimulus. Of course, you can combine inputs, i.e. you can have sinusoidal stimulus with white noise on top of that. For jitcdde backend with adaptive dt, the full system input is passed as cubic Hermite spline (since integrator needs to interpolate based on actual dt used in integration), for numba backend it's simply numpy array with correct time dimension.

Model builder

My idea is that this framework will pose as a Lego or basic building blocks for experiments with mass models. I provided NeuralMass definitions for AdEx mass, FitzHugh-Nagumo mass, Hopf mass, thalamic mass model, Wilson-Cowan mass and both reduced and EXC/INH Wong-Wang mass. You can build any node using these masses and then any network using any nodes. Of course, creating a new/novel mass is easy - I will provide a simple guide, but in general, it's all about defining the right-hand side of the equation and few basic descriptive attributes and everything else is taken care of. Now imagine: you can create a layer-resolving cortical node just by stacking not 1 excitatory and 1 inhibitory masses into it, but rather 5 of each, provide some reasonable connectivity and delays (both 10x10 matrices in this case) and you done. E.g. AdEx layer-resolving cortical node in 10-15lines. And, of course, all masses can have different parameters and all nodes within a network can have e.g. different within-node connectivity. Possibilities are endless if you work with basic building blocks defined as dy/dt and stack them. Thus you can have heterogeneity at any level (intrinsic mass parameters, node connectivity, node delays, mass types, mass equations, you name it).

Connection to current `neurolib` architecture

So this is not easy... I started doing it. At this point, instead of base Model used by currect neurolib models I created MultiModel class (which is based on Model class) with different init. What I want to achieve is the same functionality of Model class but tailored on the above-described framework. My idea was that MultiModel will take only one parameter on init and that can be either Node or Network - all other Model attributes and parameters (such as Model.params, Model.default_output, etc) are inferred directly from the model instance. The parameters will probably have a tree-like structure (with dot-based dictionary, hence something like Model.params["node0.mass0.tau"]).

AFAIK when MultiModel would implement all basic Model methods, such as run, autochunk etc, the whole neurolib functionality would just work out-of-the-box - mainly exploration and evolutionary optimisation.

Final notes

I am currently in the process of writing MultiModel in the branch feature/multimodel. You can git checkout it and see all the code. The model builder is copy-pasted from my previous code and I also added tests (a lot of them). I want to be sure everything works, so e.g. I am testing each model definition by running it with both backends and correlating results (they need to be the same).

As for the PR I was thinking I would finish basic draft for the MultiModel and then fire up a PR and test all the features and notebooks afterwards, so the final PR is not 10k lines. Right now it's about 5.5k lines...

So that's about it :)

P.S. there are some interesting scientific questions within the framework - we can discuss them in person

caglorithm commented 4 years ago

Thanks a lot for the overview! Will go into detail when I find time! I think though that adding 5.5k lines will cause a lot of review and getting things right from the beginning, especially things like consistency in how we call things like Node, Mass, Network, Brain Network, etc... I've seen in the code that there are places where the terms are not very clear to me. I was thinking that adopting a more application-agnostic terminology like "nodes" and "network" or "circuit" could be better than "mass" or "brain area" since the applications could vary a lot and the models used could be just an oscillator or a spiking neuron etc. Hope you agree!

Also I hope you understand that merging a big PR like this is going to take work from my side and might take some time because I need to get into the code in order to be able to maintain it. That being said: I'm super stoked and want to say that the code looks really great, had a look at it yesterday night!

jajcayn commented 4 years ago

btw, another possibility I already thought about in order to ease the maintenance is to not include this MultiModel thingy inside neurolib per se, but rather create another repo as neurolib-devs, call it something and just copy the structure of neurolib, so the two are compatible (e.g. for explorations and optimisation)... that way we'd have two separate repositories / packages that can communicate (mainly MultiModel to base neurolib, another way around we do not need) but can be maintained differently, theoretically by different people...

neurolib-dev / neurolib