SciML / ModelingToolkit.jl

An acausal modeling framework for automatically parallelized scientific machine learning (SciML) in Julia. A computer algebra system providing integrated symbolics for physics-informed machine learning and automated transformations of differential equations.
https://mtk.sciml.ai/dev/

Use tearing information directly in codegen #1387

Closed. Keno closed this issue 8 months ago.

Keno commented 2 years ago

At the moment, after tearing, we symbolically substitute solved equations back into the original system and then codegen that. This makes little sense. Substitution is essentially free at codegen time by using an SSA value in place of the original variable. Worse, by substituting ahead of time, the compiler needs to do significant work to CSE the substituted equations back into the form that we already know it should be in. #1384 does some refactoring in this direction, but not all the way. We should finish this.
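A toy sketch of the point above (in Python for brevity; this is not MTK code, and the solved equation is made up). Symbolically substituting a solved equation into every residual that uses it duplicates the expression, which the compiler then has to CSE back out; emitting it once as an SSA value at codegen time gives the shared form directly.

```python
# Hypothetical torn system: tearing solved  z = x^2 + 1,
# and two remaining residuals both reference z.

def rhs_substituted(x):
    # symbolic substitution duplicates the solved expression in each use;
    # the compiler must rediscover the sharing via CSE
    r1 = (x**2 + 1) + x
    r2 = (x**2 + 1) * x
    return r1, r2

def rhs_ssa(x):
    # emitting the solved equation once as an SSA value keeps the sharing
    # explicit; no CSE work is needed
    z = x**2 + 1
    r1 = z + x
    r2 = z * x
    return r1, r2

assert rhs_substituted(3.0) == rhs_ssa(3.0)  # same values, different codegen cost
```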

ChrisRackauckas commented 2 years ago

There are a few things I would like to keep.

One is that the symbolic representation of the torn problem is useful for things like debugging and sharing the result, e.g. generating LaTeX of the simplified system for a report. That use case alone means it would be good to keep the option of doing the substitution. It is a weak reason to make the symbolic output the default, though, and it suggests that defaulting to a different form is reasonable.

Another aspect is that a symbolic representation makes tearing a fully composable AbstractSystem -> AbstractSystem transformation. If there is hidden information around beyond the symbolic form, the result may not be compatible with further simplifications post tearing. Maybe that is okay, because tearing has generally been used as a final step, essentially as a codegen process, but it does not have to be used that way. For example, what about tearing the drift equations of an SDEProblem and applying https://github.com/augustinas1/MomentClosure.jl to the torn system to generate an ODEProblem? Such a large expansion may be much better to do only after simplification. Similarly, should domain transforms (e.g. exponentiating variables to keep positivity) always be required before tearing, or could they come after? If an IR form is fully supported alongside the symbolic form, this could be solved, though all transformations would then need to be written to be compatible with the potential IR form, which can cause some difficulties. But it's plausible.

A third is post-tearing symbolic sparse Jacobians. As mentioned before, sparse automatic differentiation can be as bad as O(n) for n columns or rows versus O(1) for the symbolic sparse Jacobian, even when the sparse AD is done absolutely perfectly. So there is a major performance limitation to using AD in a lot of important cases here. Of course, given the code growth, the sparse Jacobian shouldn't always be used, but it is a very important option in many scenarios (probably <10,000 ODEs post tearing or something of the sort; BCR is a nice test example where symbolic sparse Jacobians do well while AD is currently untenable). If the result is symbolic, then generating the fully symbolic representation of the sparse Jacobian is rather trivial. If what is returned is some IR form, then we would need some way of generating symbolic sparse Jacobians from the IR, which would likely amount to doing the substitutions back into a symbolic form. If a system for the first problem (debugging/LaTeX/etc.) exists, it could be reused here.

(Note that for the IR form, jacobian_sparsity could be run to get the sparsity pattern for sparse AD. That said, there may be a better way to produce the symbolic sparse Jacobian: currently we generate the symbolic form, take the symbolic Jacobian, and then do codegen, which then has to CSE away a bunch of redundancy. Maybe there is a way to generate the post-CSE IR for a sparse Jacobian more directly. But note that is an interesting open question that I don't think we have an answer to, since sparse AD tricks do not give something similar, and indeed can give something a lot worse in terms of runtime asymptotic performance.)
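The CSE redundancy mentioned above can be illustrated with a toy sketch (Python, not MTK code; the expressions and the string-based "codegen" are made up). Jacobian entries of equations that share a term also share subexpressions in their derivatives, and a trivial CSE pass collapses the duplicates into one temporary:

```python
# Hypothetical symbolic Jacobian entries of  f1 = sin(x)*y,  f2 = sin(x)*z:
#   df1/dx = cos(x)*y,  df2/dx = cos(x)*z  -- both repeat cos(x).
# Each entry is (shared subexpression, operator, remaining factor).
entries = [("cos(x)", "*", "y"), ("cos(x)", "*", "z")]

def cse(entries):
    """Emit pseudo-code lines, hoisting each repeated subexpression once."""
    temps, code = {}, []
    for i, (sub, op, var) in enumerate(entries):
        if sub not in temps:                 # first sighting: bind a temporary
            temps[sub] = f"t{len(temps)}"
            code.append(f"{temps[sub]} = {sub}")
        code.append(f"j{i} = {temps[sub]} {op} {var}")
    return code

code = cse(entries)
# cos(x) is computed once, then reused by both Jacobian entries
```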

Keno commented 2 years ago

I'm confused about the sparse Jacobian comment. We can read off the sparsity pattern from the bipartite graph, which tells us, for every equation, which variables it needs to be differentiated with respect to. That's easy to do either symbolically or using AD.
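Reading the pattern off the graph can be sketched as follows (Python toy, not MTK code; the incidence data is made up). Row i of the Jacobian has a nonzero in column j exactly when equation i is incident to variable j in the bipartite graph:

```python
# Hypothetical equation-variable incidence of a torn system:
# equation index -> set of variable indices it depends on.
incidence = {
    0: {0, 1},
    1: {1, 2},
    2: {0, 2},
}

def sparsity_pattern(incidence, nvars):
    """Dense 0/1 matrix: pattern[i][j] == 1 iff eq i touches var j."""
    return [[1 if j in incidence[i] else 0 for j in range(nvars)]
            for i in sorted(incidence)]

pattern = sparsity_pattern(incidence, 3)
# each row lists the variables the equation must be differentiated against
```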

ChrisRackauckas commented 2 years ago

What I mean is that sparse automatic differentiation, i.e. giving the sparsity pattern to https://github.com/JuliaDiff/SparseDiffTools.jl, has plenty of cases where it is asymptotically much worse than the code generated by symbolic differentiation. That is rather fundamental to the method: forward/reverse AD works column-by-column/row-by-row, and it has to avoid a form of perturbation confusion when doing multiple columns/rows at a time. So generating an IR plus a sparsity pattern for use with SparseDiffTools.jl is one way to get sparse Jacobians, but it will by design be a lot slower than something that can give a symbolic sparse Jacobian. What I'm saying is that if the result is an IR form, then in cases where a symbolic sparse Jacobian is feasible to construct as an optimization, we should be able to do it, and the symbolic substitution form is a trivial way to get there. That said, if there's a specialization of sparse_jacobian which uses the tearing information and the IR to generate the symbolic Jacobian directly, without ever using the substitution form, that's probably better.
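The asymptotic gap can be seen on an arrowhead sparsity pattern (Python toy, not MTK/SparseDiffTools code; the greedy coloring here is a simplification of what a real coloring scheme does). Forward-mode column compression may only merge columns that touch disjoint rows; with a dense first row every pair of columns collides, so compression needs n colors, i.e. n forward passes, while a symbolic sparse Jacobian just fills its ~3n entries in one evaluation:

```python
def arrowhead_rows(n):
    """Sparsity of an n x n arrowhead Jacobian: dense first row/column
    plus the diagonal. Returns, per column, the set of nonzero rows."""
    rows_of_col = [{0, j} for j in range(n)]   # dense row 0 + diagonal
    rows_of_col[0] |= set(range(n))            # dense column 0
    return rows_of_col

def greedy_color(rows_of_col):
    """Greedy column coloring: columns may share a color only if their
    nonzero row sets are disjoint. Returns the number of colors used."""
    color_rows = []                            # union of rows per color
    for rows in rows_of_col:
        for used in color_rows:
            if not (rows & used):              # no row collision: reuse color
                used |= rows
                break
        else:
            color_rows.append(set(rows))       # collision everywhere: new color
    return len(color_rows)

# every column shares row 0, so compressed forward AD needs n sweeps,
# regardless of how cleverly the coloring is done
for n in (4, 8, 16):
    assert greedy_color(arrowhead_rows(n)) == n
```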

I think that points to a general theme, though: when acting on a post-torn system, there are ways to make other transformations, sparse_jacobian, etc. all work without the symbolic substitution form, but every function that wants to do so needs a new specialization that is quite a bit more difficult than the naive algorithm. Thus the symbolic substitution form is essentially a fallback that can be used until more sophisticated specializations exist for the post-torn IR forms.

ChrisRackauckas commented 8 months ago

This was completed a while back in different forms.