stan-dev / stanc3

The Stan transpiler (from Stan to C++ and beyond).
BSD 3-Clause "New" or "Revised" License
140 stars 44 forks

Compile math library ahead-of-time and create a Stan interpreter #254

Open seantalts opened 5 years ago

seantalts commented 5 years ago

(I'm going to try to start writing up issues for some of the ideas I have recorded over the past year or so for amazing things we can do with the new compiler infrastructure here. I'll apply this big-exciting-project label (open to new label names too 😅)).

So it is well known by Stan folks that compiling the entire supported Math library ahead of time is completely intractable, primarily because the number of vectorized and unvectorized types we support for every heavily templated function causes a combinatoric explosion in the amount of generated code. I've heard reports that just pieces of the math library compiled like this result in 100s of GBs of compiled code.

However, now we might actually be able to compile just a small subset of the Math library where we choose to, e.g., only support Eigen types and drop the non-vectorized versions of functions. Then we can have a new backend rewrite the Stan source code to move all scalars into 1-wide Eigen vectors, all real[] into vector, and so on. There will be a few hangups with this:

1) Obviously there will be some additional allocations, but I hope the cost will be quite low and possibly optimized away by the compiler. And once we have an EigenBase refactor of the Math library, we can use fixed-size Eigen types on the stack without allocating, which should take care of most of the rest of this issue.
2) A couple of Math library functions were specifically written to only support std::vector<double>. We will have to either extend those or have the compiler translate before and after.

If anyone else can comment on the other potential pitfalls with this approach I'd really appreciate it. I'll try to keep this top-level comment up to date with additional content. //cc @bob-carpenter @bgoodri @syclik

Here's why this would be awesome:

1) Standard wins due to the immediate feedback cycle during Stan and Stan model development. These are pretty well-documented elsewhere [citation needed].
2) We'd be able to distribute a version of Stan / RStan / PyStan that did not need any C++ compiler or toolchain at runtime(!!!!)

bob-carpenter commented 5 years ago

I was originally thinking of the same kind of array/vector distinction that Eigen made, but that never held up. So there's not much rhyme or reason to what's implemented where. The only thing we haven't done is allow automatic casting of real[] to vector or row_vector, so there still aren't linear algebra operations allowed on real[] or real[,] types in Stan. Nor do I think there should be.

The big combinatoric hit is the probability functions. Most other functions require more coherent argument structures, so they don't have the same combinatoric explosion. For these (and any other functions that might be a problem), I was suggesting precompiling the heavy lifting using var and double argument versions.

Pitfalls are obviously (a) implementation, documentation, and maintenance costs and (b) performance. In terms of learning, it's yet another decision for users and another path (I suppose that could be a positive).

seantalts commented 5 years ago

> I was suggesting precompiling the heavy lifting using var and double argument versions.

This is the one where we'd have to rewrite the Math library signatures for these, right? Or can we autogenerate something on top of the existing distribution functions?

> Pitfalls are obviously (a) implementation, documentation, and maintenance costs and (b) performance. In terms of learning, it's yet another decision for users and another path (I suppose that could be a positive).

Pitfalls of which thing? Adding or rewriting those distribution signatures to be in terms of var and double?

I think for Stan 3 I might propose we just remove real[] from the language, but I don't feel too strongly about that.

bob-carpenter commented 5 years ago

> > I was suggesting precompiling the heavy lifting using var and double argument versions.
>
> This is the one where we'd have to rewrite the Math library signatures for these, right? Or can we autogenerate something on top of the existing distribution functions?

The hard thing to compile is the var and double parts. I'm imagining something like this:

```cpp
template <typename T1, typename T2>
typename return_type<T1, T2>::type
exp_lpdf_raw(const T1 y, long n_y, const T2 alpha, long n_alpha) {
  // create views of y and alpha and do the computation
}
```

and then something like:

```cpp
template <typename T1, typename T2>
typename return_type<T1, T2>::type
exp_lpdf(const T1& y, const T2& alpha) {
  return exp_lpdf_raw(to_ptr(y), size_of(y), to_ptr(alpha), size_of(alpha));
}
```

So everything now gets two implementations, original and raw. The hard compilation is of the _raw version where all the work of matrices, autodiff, etc. is happening. The to_ptr and size_of things are simple overloaded functions, not metaprograms. It's more complicated for multivariate inputs---I haven't thought that through all the way, but presumably it just needs more sizing.

> > Pitfalls are obviously (a) implementation, documentation, and maintenance costs and (b) performance. In terms of learning, it's yet another decision for users and another path (I suppose that could be a positive).
>
> Pitfalls of which thing? Adding or rewriting those distribution signatures to be in terms of var and double?

I meant of rewriting using arrays. Mainly, it's just implementation and performance---don't know what I was thinking about everything else.

> I think for Stan 3 I might propose we just remove real[] from the language, but I don't feel too strongly about that.

We could do that. It just gets hard to describe the type system other than to say it's this, but without real[]. I want to have automatic vectorization work, but I suppose we can disable_if vector.