SciML / ModelingToolkit.jl

An acausal modeling framework for automatically parallelized scientific machine learning (SciML) in Julia. A computer algebra system for integrated symbolics for physics-informed machine learning and automated transformations of differential equations.
https://mtk.sciml.ai/dev/

kappa language like specification of modular models? #486

Closed: sdwfrost closed this issue 3 years ago

sdwfrost commented 4 years ago

Take a look at this paper, in which they use the Kappa language to specify epidemiological models (rather than cellular models). The repo for the code is https://github.com/ptti/rule-based-models. In this framework, models are specified in terms of agents that have combinations of states, e.g. this specifies individuals with infection status S, E, I, R as well as another binary state (y or n):

%agent: P(x{s e i r} g{y n})
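For intuition, a minimal Python sketch (hypothetical, not part of any Kappa tooling) of the state space this agent declaration implies: two internal sites give 4 × 2 = 8 distinct states.

```python
from itertools import product

# Enumerate the states of the Kappa agent P(x{s e i r} g{y n}): the full
# state space is the Cartesian product of the two sites' value sets.
infection = ["s", "e", "i", "r"]  # S, E, I, R infection status
group = ["y", "n"]                # the second internal state

states = [f"P(x{{{x}}} g{{{g}}})" for x, g in product(infection, group)]
print(len(states))   # 8 states
print(states[0])     # P(x{s} g{y})
```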

Is it worth considering either a Kappa language parser, or subsuming some of these ideas into the MTK IR?

ChrisRackauckas commented 4 years ago

The approach that we should take here is like what we're doing with the other languages. The MTK IR is growing to be able to capture the ideas of other languages, and then we can parse Kappa into MTK. https://github.com/isaacsas/ReactionNetworkImporters.jl is an example of this, specifically for BioNetGen files targeting the older DiffEqBio, but it will soon be targeting MTK. We plan to keep expanding the list, with SBML of course being one big one to support.

ChrisRackauckas commented 3 years ago

Modeling languages and domain-specific languages are not going to be a part of MTK. MTK should just be a very flexible IR which simplified modeling languages write into. It should be feature complete, but I don't think it should have alternative forms itself. That should be left to extension packages; Catalyst.jl is a good example. An implementation of Kappa which generates MTK code would be an interesting project, though. And if the IR needs to support something to make it work, that's worth an issue.

wwaites commented 3 years ago

Hi @ChrisRackauckas, I'm one of the authors of that paper and I've been talking about this a bit with @sdwfrost. I definitely agree with the overall approach of transforming DSLs to an intermediate representation. I am not sure if MTK can represent what is needed to simulate this class of models. The reason is that rules in the Kappa calculus are not, in general, finitely representable as reactions. Here's an intuition for why this is the case. Consider the following simple rule that creates a polymer. The agent has an upstream site and a downstream site; if a pair of agents have the right unbound sites, they get bound together at some rate:

A(u[.]), A(d[.]) -> A(u[1]), A(d[1]) @ k

Notice that this happens locally -- we do not care whether the downstream site of the first agent, or the upstream site of the second, is bound. If you wanted to write out the equivalent reactions, you'd have to do:

A + A -> AA
AA + A -> AAA
AA + AA -> AAAA
AAA + A -> AAAA
AAA + AA -> AAAAA
...

which is an infinite set of reactions involving an infinite number of species, and a correspondingly infinite-dimensional system of ODEs if that's the way you want to simulate it.
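To make the size of the problem concrete, here is a small Python sketch (illustrative only, unrelated to KaDE, KaSim, or MTK) that enumerates this reaction family up to a cutoff chain length N. The untruncated family is infinite, and even the truncated reaction count grows quadratically in the cutoff.

```python
# Enumerate the polymerisation reactions A_i + A_j -> A_{i+j}, truncated so
# that no product chain exceeds max_len. Chains are written as repeated "A"s.
def truncated_reactions(max_len):
    rxns = []
    for i in range(1, max_len + 1):
        for j in range(i, max_len + 1):   # i <= j avoids double-counting pairs
            if i + j <= max_len:
                rxns.append(("A" * i, "A" * j, "A" * (i + j)))
    return rxns

for n in (4, 8, 16):
    print(n, len(truncated_reactions(n)))   # 4, 16, 64: roughly N^2/4 reactions
```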

Thinking in terms of Petri nets, the way to get here is as follows. Rather than thinking of tokens as a multiset, think of them as a labelled discrete graph. Everything still works, but now the way that you find the propensities of the transitions is by enumerating subgraph isomorphisms. The in-neighbours of a transition make a pattern graph which is matched in the mixture or population graph. The out-neighbours form the replacement. So, having chosen a transition to fire as you do in a jump process, you remove the vertices of a particular match and replace them with the replacement.

So far, this just describes a particularly inefficient way to simulate a Petri net. But now we ask: what if we do not require the mixture/population to be a discrete graph, and we permit edges? Not only edges existing in some "given" way, but allowing transitions to create or destroy edges. Now we've gone from Petri nets to something like DPO graph rewriting, which is a different beast.
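A toy illustration of the matching step just described. The sketch below (stdlib Python, with made-up data structures; a real simulator would use a proper subgraph-isomorphism algorithm such as VF2) counts injective, label- and edge-preserving embeddings of a pattern graph in a mixture graph; multiplying the match count by a rate constant gives a transition's propensity, up to pattern symmetry.

```python
from itertools import permutations

def count_embeddings(pat_labels, pat_edges, mix_labels, mix_edges):
    """Count ordered injective embeddings of the pattern into the mixture
    that preserve vertex labels and map pattern edges onto mixture edges.
    Brute force: fine for toy graphs, exponential in general."""
    mix_edge_set = {frozenset(e) for e in mix_edges}
    total = 0
    for image in permutations(range(len(mix_labels)), len(pat_labels)):
        if (all(pat_labels[i] == mix_labels[v] for i, v in enumerate(image))
                and all(frozenset((image[a], image[b])) in mix_edge_set
                        for a, b in pat_edges)):
            total += 1
    return total

# Mixture: three A agents and one B agent, with one bound A-A pair (edge 0-1).
mix_labels = ["A", "A", "A", "B"]
mix_edges = [(0, 1)]

# Pattern "a bound A-A pair": matched only by vertices 0 and 1, in either order.
print(count_embeddings(["A", "A"], [(0, 1)], mix_labels, mix_edges))  # 2

# Pattern "any two A agents" (no edge required): 3 * 2 ordered choices.
print(count_embeddings(["A", "A"], [], mix_labels, mix_edges))        # 6
```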

Using truncation, it is possible to generate finite systems of reactions or ODEs. The KaDE tool (@feret) that comes with KaSim does this. For some applications that is fine, but it is very easy to write down systems of rules that result in systems of ODEs that are too big to conveniently integrate. And we want to be able to write down these kinds of systems because they neatly express dynamics that we're interested in for concrete purposes.

For my current work, I've had to extend the KaSim language to be able to have binding sites of arbitrary valence. In standard Kappa, a binding site can have a single bond. I'm working with explicit contact networks for infectious disease and need to represent things like households which don't fit that structure easily. So I've made a rough cut of a simulator that does this, which is good enough for my current needs but isn't very fast and doesn't support the full set of features that I'd eventually like it to.

A first step might be just to have a parser and a transformation to the MTK IR with truncation. That would be interesting because I would like to do this in a way that can be composed with other models, and having a uniform IR is a good way to facilitate that. But really we would want a fully-fledged graph rewriting simulator, otherwise much of the benefit of Kappa is lost. That means teaching MTK how to do stochastic graph rewriting. I think it's worth it, but it's a major undertaking, not a weekend coding project.

ChrisRackauckas commented 3 years ago

I am not sure if MTK can represent what is needed to simulate this class of models.

Sure, any specific feature request along those lines is a good issue to open. The (rather subtle) point of this closure is that implementing a language like Kappa is not in scope for MTK. As an example, Catalyst.jl is a higher-level language which generates MTK IR, which then optimizes and generates the simulation code. So a Kappa-like language, Kappa bindings to Julia, or any such model specification form should similarly be a separate package all about that language. Any features that are required in MTK to make that happen are their own issues which should be discussed.

This issue was specifically "kappa language like specification of modular models?", and the final answer here is no: I don't think Kappa, SBML, BioNetGen, etc. parsers should be in this library, as it already has a ton of stuff it does. Instead, those should be separate packages which we recognize in the documentation, which leaves MTK without 200 parsing dependencies.

which is an infinite set of reactions involving an infinite number of species, and a correspondingly infinite-dimensional system of ODEs if that's the way you want to simulate it.

Why not use an InfiniteArray of reactions?
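For a rough sense of the lazy-representation idea, here is a plain-Python sketch: a generator standing in for something like Julia's InfiniteArrays.jl. This illustrates the concept only, not either package's API; the infinite reaction family is produced on demand rather than materialised.

```python
from itertools import count, islice

def polymer_reactions():
    """Lazily yield the infinite reaction family A_i + A_j -> A_{i+j},
    ordered by product chain length; nothing is ever materialised in full."""
    for total in count(2):                  # product length 2, 3, 4, ...
        for i in range(1, total // 2 + 1):  # i <= total - i avoids duplicates
            yield ("A" * i, "A" * (total - i), "A" * total)

# Take just the first few reactions of the infinite list.
for rxn in islice(polymer_reactions(), 5):
    print(" + ".join(rxn[:2]), "->", rxn[2])
```

The first five reactions printed match the list written out above, starting with `A + A -> AA`.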

A first step might be just to have a parser and a transformation to the MTK IR with truncation. That would be interesting because I would like to do this in a way that can be composed with other models, and having a uniform IR is a good way to facilitate that. But really we would want a fully-fledged graph rewriting simulator, otherwise much of the benefit of Kappa is lost. That means teaching MTK how to do stochastic graph rewriting. I think it's worth it, but it's a major undertaking, not a weekend coding project.

Could this not just make use of tearing and structural_simplify? That's already a form of graph rewriting: non-stochastic, but a heuristic approximation to the NP-hard problem.