jcrist / symcc

Experimental mathematics compilation library for SymPy
Other
10 stars 3 forks source link

Merging this into SymPy #7

Open asmeurer opened 8 years ago

asmeurer commented 8 years ago

I'd like to get this merged into SymPy.

As you may know, I'm working on a project to improve the code generation in SymPy. Based on my analysis of what's out there (which somehow completely missed this) I came to the conclusion myself that we need something exactly like this, an AST representation of high-level code objects.

So my questions are:

jcrist commented 8 years ago

Do you agree that this should be merged into SymPy?

Perhaps, although I wouldn't do a direct copy-paste. Some things in SymPy have diverged since I forked the codegen stuff to here.

Are there any things here that you would change (especially API-wise)? Reading through the code, nothing has stood out to me yet as something that I would do differently (OK, one minor thing, but it's trivial to change).

I still think code generation should happen in a few passes:

In this vein, I see the approach taken here (but by no means completed) to be a valid one. One thing I'm unsure about is the creation of RoutineResult objects, which play well with SymPy's expression system. The intent here was to more easily share sub-functions generated in modules. You could generate a function for one expression, then use it in a larger expression, and when code gen was run, both functions would be included in the generated code. Would be a lot easier to auto-generate a library of functions that way, and also broke sub-routines up into logical concepts, as defined by the programmer. These kind of worked, but were buggy in their own unique way, mostly with how they played with the assumption and simplification system.

In general, it's been long enough since I wrote this code, or even thought about it, that I'm not sure anymore what the best next step is. Definitely feel free to ping me, and I'll try to be helpful, but you've probably thought about this a lot more recently than I have.

asmeurer commented 8 years ago

Perhaps, although I wouldn't do a direct copy-paste. Some things in SymPy have diverged since I forked the codegen stuff to here.

Can you be more specific what you mean here? Obviously the printer stuff that you copied from SymPy shouldn't be copied back (I was planning on checking the commit history to see if any important changes need to be moved).

I agree with your "passes" workflow. For the rewriting pass (which is the meat of what I actually want to do with this project), I would focus more on mathematical optimizations (stuff that compilers can't do). Stuff like inlining and loop merging are cool, but (at least for compiled languages), probably completely unnecessary, unless you have to use -O0.

In fact, many of the optimizations I am interested in would happen before the lowering pass, where we still have a mathematical representation of the object.

I also want to make the code ast objects usable enough that people feel comfortable using them themselves in their own code generation code (outside of SymPy). I see a lot of code generation stuff that uses ccode to convert expressions, but glues them together using templates or a bunch += lines (example), and I think it's an antipattern, because it's difficult to read and maintain.

Regarding the Routine thing, I haven't looked at those classes yet (still looking at the basic ast classes), so no comments there yet.

asmeurer commented 8 years ago

OK, I've read a little more.

The Routine stuff seems a bit confusing, though not as confusing as the stuff currently in SymPy, so that's an improvement. The commit message of https://github.com/jcrist/symcc/commit/668da9ac3b6a2e2ed16bcfb8577d87651a4874a5#diff-2af53fbf1023a006b9f421744c9c4ebf really helped me understand what they were for, so if we end up keeping it, we should copy that into the docs.

I'm a little unclear why a separate Routine class is needed, though. Wouldn't it be enough to use FunctionDef? If you do that, then RoutineCallResult (which is a confusing name by the way) would just be f(x) where f is Function(name) and name is the name used in the FunctionDef. Is it possible to have a routine that won't be printed as a function definition in the generated code?

Please disprove me if I'm wrong on this, but my gut feeling is that proxy objects for function calls to pull in assumptions are not useful. In the simple case (i.e., the case where we could actually compute some assumptions), the function (or routine, if you will) will just be a single expression, or reducible to a single expression. In that case, I'd rather see higher level (not specific to codegen) abstractions for making it easier to define a function representing an expression.

In the more complicated case, a routine will have loops, conditionals, and other things that would make actual computation of the assumptions on the final result more difficult. More to the point, though, the assumptions system is designed to work on the mathematical level of abstraction (and ditto for simplification). Sum may know what its assumptions are, but For won't. Similarly, Sum may know how to evaluate, or at least to simplify (e.g., by pulling out a constant), but For won't, at least without converting to Sum first. We probably could do stuff at the codegen AST level, but stuff in SymPy is already implemented at the mathematical level, so it's better to keep things there whenever possible. I suppose an exception might be if we want to use the assumptions system to do type inference (my personal goal for now though is to mostly ignore types and just assume everything will be a double).

So my feeling for now is that the routine stuff is not needed. It does do some things that are not so easy to do, but I think most of those things would be better improved at the SymPy math level than at the codegen AST level (like making it easier to create custom Function objects that represent expressions.

I want to start by copying ast.py to SymPy, probably under a new sympy.codegen module (I think it's about time we had one).

asmeurer commented 8 years ago

https://github.com/sympy/sympy/pull/10486