AlgebraicJulia / Catlab.jl

A framework for applied category theory in the Julia language
https://www.algebraicjulia.org
MIT License

Managing Parameters and Provenance of Scientific Models #716

Open olynch opened 1 year ago

olynch commented 1 year ago

We have three related problems in the AlgebraicJulia ecosystem.

  1. We don't have a consistent interface for managing parameters of scientific models. Sometimes models come with parameters attached, sometimes they have to be attached at the end. If we want to do parameter search or parameter estimation in any kind of principled way, we need to fix this.
  2. We apply model compositions and then forget about the way models were composed. That is, we don't get a composed model with explicit subsystems, we just get a big model. This compounds the problem of parameters: it's hard to dig through a large model to set parameters.
  3. Semantic functors often require parameters. However, sometimes the category theory of model compositions does not want to deal with parameters, and it makes more sense to simply provide new parameters to the composed model that are not derived from the parameters of the composition arguments.

Other libraries that do modeling with significant numbers of parameters have consistent stories for handling parameters. For examples, see ModelingToolkit.jl and Lux.jl.

I propose the following steps to remedy this.

  1. We should have data structures for representing parameter spaces. That is, data structures which represent the parameter spaces themselves, not elements of those parameter spaces. A naive choice for this would simply be a Set of Symbols, but we should think about the design space. Specifically, taking the disjoint union of two parameter spaces should be easy. This could be accomplished by prefixing each of the parameters with a model name, e.g. the composition of two models that each have parameters a and b would have parameters something like model1.a, model1.b, model2.a, model2.b. Then we should have LVectors that are parameterized on the parameter spaces.
  2. Operations that compose models should also compose the corresponding parameter spaces. In simple cases, this will be disjoint union as above, but in more complex cases (e.g. stratification), there will be more complex operations on the parameter spaces.
  3. One must pass in an LVector of parameters in order to access semantics.
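
To make the three steps concrete, here is a rough sketch in plain Julia. Everything in it is hypothetical: `ParamSpace`, `Params`, `disjoint_union`, `subparams`, and `vectorfield` are names invented for illustration and do not exist in Catlab or elsewhere in AlgebraicJulia. `Params` is a hand-rolled stand-in for an LVector so that the example has no dependencies, and the toy vector field is an assumption, not a real semantic functor.

```julia
# Hypothetical sketch; none of these types or functions exist in Catlab.

# Step 1: a parameter space lists the parameters a model expects, not their values.
struct ParamSpace
    names::Vector{Symbol}
    function ParamSpace(names)
        names = collect(Symbol, names)
        allunique(names) || throw(ArgumentError("duplicate parameter names"))
        new(names)
    end
end

# Stand-in for an LVector: values tagged with the space they belong to.
struct Params
    space::ParamSpace
    values::Vector{Float64}
    function Params(space::ParamSpace, values::AbstractVector{<:Real})
        length(values) == length(space.names) ||
            throw(DimensionMismatch("expected $(length(space.names)) values"))
        new(space, values)
    end
end

function Base.getindex(p::Params, name::Symbol)
    i = findfirst(==(name), p.space.names)
    i === nothing && throw(KeyError(name))
    p.values[i]
end

# Step 2: disjoint union, made unambiguous by prefixing with a model name.
prefixed(prefix::Symbol, space::ParamSpace) =
    ParamSpace([Symbol(prefix, ".", n) for n in space.names])

disjoint_union(spaces::Pair{Symbol,ParamSpace}...) =
    ParamSpace(reduce(vcat, [prefixed(k, s).names for (k, s) in spaces]; init=Symbol[]))

# Recover the parameters of one subsystem from the composite parameters.
function subparams(p::Params, prefix::Symbol)
    pre = string(prefix, ".")
    idx = [i for (i, n) in enumerate(p.space.names) if startswith(string(n), pre)]
    names = [Symbol(string(p.space.names[i])[ncodeunits(pre)+1:end]) for i in idx]
    Params(ParamSpace(names), p.values[idx])
end

# Step 3: toy semantics for a model with parameters a and b. The vector field
# u -> a*u - b is only available once parameter values are passed in.
vectorfield(p::Params) = u -> p[:a] * u - p[:b]

# The example from step 1: two models, each with parameters a and b.
m = ParamSpace([:a, :b])
composite = disjoint_union(:model1 => m, :model2 => m)
p = Params(composite, [0.1, 0.2, 0.3, 0.4])
f2 = vectorfield(subparams(p, :model2))
```

Here `composite.names` holds the four prefixed symbols `model1.a`, `model1.b`, `model2.a`, `model2.b`, and `subparams` keeps enough of the composition structure to set or read the parameters of one subsystem. Encoding the structure in name prefixes is the naive option; a nested representation might suit operations like stratification better.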

I'm not sure if Catlab is the best place for this to exist, but there should be a consistent story across the AlgebraicJulia ecosystem.