Caching opt-problems to speed up compilation.

multigrid101 commented 6 years ago

Hi there,

I am currently working on some cpu backends for Opt.

I have scripts that perform tests and due to the large number of available options (useOptLM, backend, number of threads, .....) there actually is a large number of tests to perform.

A significant bottleneck to test-performance is the compilation of the problems (At the moment about 20 opt-compiles per example, i.e. hundreds in total.) It should be possible to run these in seconds (not minutes). Also, the minutes will likely increase to hours if more features (e.g. solvers) are added.

So far, this has not really been a problem but especially with the new backends present (frequent) testing will be important.

I don't really have any concrete suggestions yet, I am mostly looking to start a discussion on this. I don't think that caching is really an issue from a user-perspective (this is already achieved by having separate API function for problem definition and problem solution).

Some thoughts:

The automatically generated functions (e.g. PCGStep1) seem to be constant for most solver-settings, so it should be possible to cache at least those. They only account for about half of total compilation time though...
The solver itself depends on the backend, is caching even possible in that case?

Mx7f commented 6 years ago

I am a bit confused about the parameters of the tests. Is the idea that you will be re-running the same tests without changing the energy, and so you only want to pay the cost of compilation once per backend? Why would you run the exact same test without needing to recompile?

multigrid101 commented 6 years ago

Ok I admit that my question was too general.

Maybe let's focus on the existing codebase for now and consider the new backends at a later point.

As I said, from a user-perspective, there shouldn't be too many problems anyway while running single solves on large problems, but let's consider the following, very simple, scenario:

User writes an energy along with some C code. The C executable is supposed to perform e.g. denoising on a picture. So User now has a program to denoise images and wants to apply this algorithms to many pictures, most of which aren't very large. Thanks to the speed of Opt's gpu backend, the denoising itself takes a lot less than a second, so the denoising of all of his pictures shouldn't take too long.

(We could vary this scenario so that the executable also takes the weight of the regularization-term as a command line arg, which would be a typical scenario for a researcher preparing a paper on a revolutionary deniosing method)

Unfortunately, the compilation of the energy into opt-code has to be done every time the executable is run, thus increasing the total "denoising time" from minutes to hours.

I am, of course, exaggerating a little but I think that this scenario could serve as a good starting point for the discussion. Changing the backend will, of course, require re-compilation of at least parts of the solver, so let's not consider it for now.

Mx7f commented 6 years ago

Got it. There are effectively two bits of functionality in one for this request: the ability to save and load a solver, and then using said functionality to make a cache, which could just boil down to hashing the energy function and any relevant compile-time parameters (say, specifying the backend), saving and loading the binary to a specific location with the hash as an identifier.

Saving and loading is the part with unknown difficulty. Terra has a saveobj function that we could attempt to use to save init() and step() for a particular solver instantiation. I do not have the bandwidth to attempt this myself at the moment; it could be a one-day project or there might be unforeseen problems that spiral into something larger. Either way, I am interested, and would like to come back to this later.

multigrid101 commented 6 years ago

Yes, you are right. I guess focusing on the saving and loading part (as soon as time permits it) might suffice for the beginning.

The next problem down the line will be "running all tests takes too long", but I guess its best to come back to this after you had a chance to look at the new backends (and tests).

EDIT: I used to use a DSL very similar to Opt (but in a finite element context), called fenics. It has been around for a while and seems to be widely used and mature. Maybe we can look for ideas about the whole caching thing there when the time comes. Here's what I (vaguely) remember:

There is a command line tool to convert "energy" specifications into "something" that can be loaded by the C API. Afaik, the conversion can also be achieved from the C API itself.
There is caching going on to prevent unnecessary re-compilation.

niessner / Opt

Caching opt-problems to speed up compilation. #113