SciML / DifferentialEquations.jl

Multi-language suite for high-performance solvers of differential equations and scientific machine learning (SciML) components. Ordinary differential equations (ODEs), stochastic differential equations (SDEs), delay differential equations (DDEs), differential-algebraic equations (DAEs), and more in Julia.
https://docs.sciml.ai/DiffEqDocs/stable/

Defining a Parameter/Structures Interface #881

Closed · ChrisRackauckas closed this 8 months ago

ChrisRackauckas commented 2 years ago

Guiding issues:

Guiding designs:

For a long time we were able to get away with just "let p be any struct", though an edge case came up, and then a second one, and then a third one, etc. until it became clear we need to at least have an interface defined for "what can be done with p". Similarly, we have allowed a loose definition of u0 as "anything that can broadcast effectively, and if required can do linear algebra", but this has become limiting.

The main issue really started cropping up with adjoint sensitivity analysis. In that case, one might want to extend the differential equation to include p, e.g. [u0;p] for InterpolatingAdjoint, which then means that p needs to be able to "act" like a u0. If that's the case, then if we assume u0 needs to be able to broadcast and do linear algebra, suddenly this requirement is also on p. While this works for "most" p's that people use in practice, it does impose a limitation where using say a tuple of arrays for parameters works for "standard" equation solving, but fails with adjoints.
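
A rough sketch of the problem (values are purely illustrative, not DiffEqSensitivity internals): a tuple of arrays for p is fine for a forward solve, but an InterpolatingAdjoint-style augmentation conceptually concatenates p onto the state, at which point p must support the same flattening/broadcast/linear-algebra operations as u0.

    # Illustrative only: why `p` inherits `u0`'s requirements under adjoints.
    u0 = [1.0, 2.0]
    p  = ([0.5], [1.5, 2.5])          # tuple of arrays: fine for a forward solve

    # An InterpolatingAdjoint-style extension conceptually augments the state,
    # which forces `p` to flatten / broadcast / do linear algebra like `u0`:
    z0 = vcat(u0, reduce(vcat, p))
    @show z0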

This then requires that we ask the question: what is "the" SciML parameters or structures interface? Currently it has essentially been AbstractArray, since AbstractArray is the only thing that effectively defines linear algebra, with extensions allowed when the extra functionality is not used. However, it's probably more than overdue to both tighten and enlarge this interface by directly defining what its actions should be.

If we do this right, then people should be able to use weird structures everywhere and have things "just work", or tell them which functions to define in order for things to just work. How do we get there?

What should a SciML Parameters interface do?

  1. It needs to be fast and type-stable. If it's not, people will ignore it.
  2. It needs to be overloadable. If someone's structure doesn't work, they should have a clear way to make it work.
  3. It needs to be able to throw appropriate errors about missing structural information. For example, which function needs to be defined.
  4. It needs to support standard Julia types out of the box.
  5. It needs to have clearly defined behavior for restructuring and destructuring to/from arrays. In particular, restructuring should always use the values from the restructure vector, and not change it like the Optimisers.jl behavior on Complex (https://github.com/FluxML/Optimisers.jl/issues/95)
  6. Restructuring/destructuring on the simplest types needs to be a no-op. This is required or else supporting calls to C/Fortran is almost impossible, which is required for many wrapper solver libraries (for example, supporting Sundials) but also even things as pervasive as supporting linear algebra.
  7. It needs to support naming and named indexing. This would give ModelingToolkit a way to interpret code down to its DSL in a way that preserves naming, and generate structures that are more interpretable for users than its flat vectors.
  8. It needs to be able to tag that some things are "parameters" while other things are not: which things should be optimized and which are constant.
  9. Non-SciMLStructure p's still need to work in the solvers: e.g. only the places that need more control should check isscimlstructure (and if false, throw an error that this is required). A minimal sketch of what such an interface could look like follows this list.
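
As a purely hypothetical illustration of requirements 2, 5, and 9 above, an opt-in trait plus a destructure function that returns a flat vector of tunables and a repacking closure might look like this (none of these names are a final API):

    # Hypothetical sketch; names are illustrative only.
    isscimlstructure(p) = false              # opt-in trait; plain p's stay untouched

    struct MyParams
        k::Vector{Float64}                   # tunable parameters
        mesh::Matrix{Float64}                # non-tunable context
    end
    isscimlstructure(::MyParams) = true

    # Destructure to a flat vector of tunables plus a closure that rebuilds the
    # struct, always taking values from the flat vector (requirement 5).
    destructure(p::MyParams) = copy(p.k), v -> MyParams(collect(v), p.mesh)

    p = MyParams([1.0, 2.0], zeros(2, 2))
    v, repack = destructure(p)
    p2 = repack(2 .* v)                      # rebuild with updated tunables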

Ideas from Other Libraries / Do we need a separate library for this? Can we use another one?

Questions

Do we extend the solvers to allow for a SciML Structure/Parameter for u0 or always flatten?

This is potentially a considerable amount of work, but with a potentially considerable amount of gain. It would be an interesting solution to the adjoint sensitivity problem: if [u0,p] "just works" because arrays of arrays are supported in the structure interface, then some code becomes easier. But other things, like how to define mass matrices, become 😕. Maybe all matrices need to only be defined on the flattened form, and then LinearSolve.jl can do the restructure/destructure to make it all work nicely?

Supporting this in OrdinaryDiffEq.jl is probably not "too" hard. Every 2 * k1 would need to become something like fmap(*, k1, 2). It would probably be best to just define that as an operator 2 ⊗ k1 or something so the code is easier to read, but that would raise the barrier to entry. For the broadcast operations, it would just require that we specialize @.. effectively, which would probably be the hardest part (if all(isSciMLStructure, vals) -> special dispatches?). @.. would no longer be just "Fast Broadcast" though, since it would have a different definition/operation on things like arrays of arrays (but it would then allow supporting them without requiring a RecursiveArrayTools.VectorOfArray wrapper, something groups like DynamicalSystems.jl have been requesting for years).
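
A rough illustration of that lowering using Functors.jl (the ⊗ name is hypothetical, and a real stage value k1 would come from the solver, not be hand-written):

    using Functors                    # fmap maps a function over the leaves

    k1 = (a = [1.0, 2.0], b = [3.0 4.0; 5.0 6.0])   # a structured "stage" value

    ⊗(α::Number, x) = fmap(leaf -> α .* leaf, x)    # what `2 * k1` could lower to
    k2 = 2 ⊗ k1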

For other solvers, we'd have to do destructure/restructure in the wrapper code so it just presents as arrays to the underlying solver. To not have a performance hit, we will need (6) to hold.
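
A hedged sketch of that boundary pattern (the flatten/unflatten helpers here are made up for illustration): the wrapped RHS only ever sees a plain Vector{Float64}, so a Sundials-style solver can call it directly, and requirement (6) is what keeps the round-trip a no-op for already-flat p.

    # Illustrative only: present structured parameters to a wrapped solver as a
    # flat vector, rebuilding the user's structure at each RHS call.
    p = (k = [0.5, 1.5], c = 2.0)

    flatten(p)   = vcat(p.k, p.c)
    unflatten(v) = (k = v[1:2], c = v[3])

    rhs!(du, u, p, t)      = (du .= -p.c .* p.k .* u)      # user-facing, structured p
    rhs_flat!(du, u, v, t) = rhs!(du, u, unflatten(v), t)  # what the wrapper calls

    u = ones(2); du = similar(u)
    rhs_flat!(du, u, flatten(p), 0.0)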

I think we should do it, but it will be a challenge.

Do we provide a tables interface for reading/writing to data?

I can see this as very valuable for people working with CSVs and the like. https://github.com/rafaqz/ModelParameters.jl showed it's possible.
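
For instance (purely illustrative, not a proposed API), a flat name/value view of parameters is already a Tables.jl-compatible row table, so it can be handed straight to CSV.write:

    using CSV

    p = (k = 0.5, c = 2.0)
    rows = [(name = String(n), value = getfield(p, n)) for n in propertynames(p)]
    CSV.write("params.csv", rows)     # a Vector of NamedTuples is a valid row table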

ranocha commented 2 years ago

I would like to emphasize a requirement (that is somewhat hidden in the list above): If people do not use something like sensitivity analysis, they should not need to care about this and can still use everything they have used before. For example, we pass our semidiscretization as p in Trixi.jl, containing numerical parameters, the mesh, caches for mutating operations etc.

ChrisRackauckas commented 2 years ago

Yes important point, added to the list.

ArnoStrouwen commented 2 years ago

These also seem related: SciML/DiffEqParamEstim.jl/issues/107 SciML/DiffEqParamEstim.jl/issues/102

ToucheSir commented 2 years ago

Note that FlexibleFunctors is now part of Functors.jl.

See also the monster discussion in https://github.com/invenia/ParameterHandling.jl/issues/43. TL;DR this is fiendishly complex to do in general, but it would be amazing to have more standardization around interfaces.

ChrisRackauckas commented 2 years ago

We don't have to be too general, just have an interface which provides everything SciML needs, and update it as SciML needs more.

briochemc commented 2 years ago

Suggestion: Add https://github.com/paschermayr/ModelWrappers.jl to the guiding designs?

ChrisRackauckas commented 2 years ago

I talked with @YingboMa and the approach we're going to go with is to prefer canonicalization, i.e. the internals of the solvers won't need any fmap or special broadcast because everything will be represented by simple arrays, and then we restructure/destructure at the boundary to the user (f calls and getproperty overloads) to mask this.
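
A toy sketch of that masking (hypothetical types, not the actual implementation): the solver only carries a flat vector, and a thin getproperty overload restores named access at the user-facing boundary.

    # Illustrative only: named access backed by a flat, solver-friendly vector.
    struct CanonicalParams{V <: AbstractVector}
        vals::V
        index::Dict{Symbol, Int}
    end

    Base.getproperty(p::CanonicalParams, s::Symbol) =
        s === :vals || s === :index ? getfield(p, s) :
        getfield(p, :vals)[getfield(p, :index)[s]]

    p = CanonicalParams([0.1, 2.0], Dict(:α => 1, :β => 2))
    p.α + p.β                         # reads straight out of the flat vector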

isaacsas commented 2 years ago

Does that have any potential (performance?) impacts on using parameter objects with mixed types (i.e. including a mix of functions, floats, ints, etc in a type-stable manner)? Or would this only be enforced in places where having a fixed type for parameters is needed anyways for the problem being solved to make sense?

ChrisRackauckas commented 2 years ago

It should be more efficient. We plan to have a non-allocating restructure!/destructure!
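
A minimal sketch of what a non-allocating pair could look like for a simple two-field struct (names and layout here are assumptions, not the planned API):

    struct P
        k::Vector{Float64}
        g::Vector{Float64}
    end

    # Copy into / out of a preallocated flat buffer; no allocations per call.
    function destructure!(buf::AbstractVector, p::P)
        copyto!(buf, 1, p.k, 1, length(p.k))
        copyto!(buf, length(p.k) + 1, p.g, 1, length(p.g))
        return buf
    end

    function restructure!(p::P, buf::AbstractVector)
        copyto!(p.k, 1, buf, 1, length(p.k))
        copyto!(p.g, 1, buf, length(p.k) + 1, length(p.g))
        return p
    end

    p = P([1.0, 2.0], [3.0]); buf = zeros(3)
    destructure!(buf, p); buf[1] = 10.0; restructure!(p, buf)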

lamorton commented 2 years ago

Edit: Thought this was under ModelingToolkit. Making a separate issue over there instead.

vpuri3 commented 1 year ago

All,

We are redesigning the operator interface for SciML.ai in https://github.com/SciML/SciMLOperators.jl. This issue has come up in discussions on how to pass parameters to function objects. The problem here, as I see it, is that there are two kinds of parameters we want to pass to functions/operators:

  1. parameters we want to do automatic-differentiation/ adjoint calculation on
  2. operator specific context or struct objects that contain mesh, numerical parameters, discretization information, etc.

The former can in most cases be cast as an AbstractArray, while the latter is almost always a complex data structure. Currently SciML only permits passing in one parameter via the p positional argument in f(du, u, p, t). Problems arise when one has to do adjoint calculations with p when it is a complex data structure. Obviously, writing restructure/destructure methods for all the data types in p can become cumbersome.

https://github.com/SciML/SciMLOperators.jl/pull/143 solves this problem by allowing users to pass the two types of parameters separately. The proposed design would allow you to pass array-like AD parameters in p, and function-dependent data structures via keyword arguments:

    using SciMLOperators, Test

    # Test scalar operator which expects a keyword argument to update, modeled in the style of a DiffEq W-operator.
    γ = ScalarOperator(0.0; update_func=(args...; dtgamma) -> dtgamma, accepted_kwarg_fields=(:dtgamma,))

    u, p, t = rand(4), nothing, 0.0   # placeholder inputs for the demonstration
    dtgamma = rand()
    @test γ(u,p,t; dtgamma) ≈ dtgamma * u

ChrisRackauckas commented 8 months ago

https://github.com/SciML/SciMLStructures.jl is the solution.