Native code and bypassing the interpreter

jlumpe commented 6 years ago

It seems like many of the interfaces in this project are to libraries written in C with a Python wrapper. Most of these libraries typically take a pointer to a C function which computes the function value(s) as an argument. Have you considered supporting printing symbolic expressions to C code and compiling them into callbacks that can be passed to the library directly? This could provide significant speedup both by doing the function evaluation in C and by avoiding the round trip through the Python interpreter. A complication is that different libraries expect callbacks with different signatures, but it that seems solvable.

bjodah commented 6 years ago

I certainly have. In the sister project pyodesys we do exactly that. The only reason to why I haven't in pyneqsys is "simply" lack of time. The speed-ups are, as you guessed, significant.

EDIT: here are the files for native code generation in pyodesys: https://github.com/bjodah/pyodesys/tree/master/pyodesys/native

jlumpe commented 6 years ago

Aha, yes it looks like the native code generation in pyodesys is pretty advanced. The reason I ask is that I have been considering developing a similar package that presents a unified interface to different nonlinear optimization libraries. It seems like these are very similar things at their core though, in that we are investigating the properties of nonlinear functions from R^n to R^m (n == m in the case of ODEs). I think with some minor extensions to the framework you've developed here and in pyodesys you could have a universal interface for root finding, curve fitting, nonlinear optimization, ODE solvers, and many other numerical analysis algorithms.

bjodah commented 6 years ago

Sounds like an interesting project. It sounds similar to what I envision for these packages on a longer term as well.

If you want to improve upon these packages you are most welcome to submit pull requests. If the change is controversial (e.g. breaking backwards compatibility etc.) just open an issue before putting too much work in.

I did write a unified C++ interface to initial value problems for systems of ordinary differential equations:

https://github.com/bjodah/anyode

I then use that interface in pyodesys. A similar approach would probably be viable for pyneqsys.

Some status-updates and lessons learned:

Right now the code generation in pyodesys doesn't really do "outer loops" (arising e.g. from discretization of PDEs by method-of-lines), at some point I want to fix that. Generating the code for the loops is fairly straightforward, but realizing what it means for the structure of the Jacobian and inferring what solver to use based on that is a bit more work, especially since not all underlying libraries support the same feature-set (SUNDIALS really shines here). My current roadmap is to make pyodesys extensible and then make e.g. a "method-of-lines package" relying on pyodesys. DAEs are also on the roadmap (will use IDA from SUNDIALS).

The same goes for pyneqsys (which uses lambdify, which—as you know—is also a form of code-generation): non-linear optimization e.g. residuals in curve-fitting problems involves "looping over" all the data points, generating 10 000 lines of code for something that is essentially a small loop over 10 000 elements is obviously a no-go. As for underlying libraries I really want to make IPOPT support in pyneqsys first class. IPOPT used to be a pain to compile (having to apply patches to dependent libraries and what not), thanks to conda-forge however installing it is a breeze.

Another humbling experience is that it is quite easy to under-estimate how big of an undertaking this is (and not only due to design & implementation, but also from all kinds of issues with version mismatches of libraries, compiler bugs & cross-platform issues).

Hope you find some of it useful.

bjodah / pyneqsys

Native code and bypassing the interpreter #39

Some status-updates and lessons learned: