boo-lang / boo

The Boo Programming Language.
BSD 3-Clause "New" or "Revised" License
874 stars 148 forks source link

Meta methods cause a serious performance hit via inefficient reification #140

Closed masonwheeler closed 7 years ago

masonwheeler commented 8 years ago

I've been doing some torture-testing on large codebases, and I just ran into something interesting: every time you expand a meta method, it potentially causes the compiler to do multiple macro and attribute expansion passes over the entire codebase.

If you think "hey, that doesn't make any sense", you're absolutely right. But here's what happens:

The compiler invokes the meta method and gets back its result, which must be either an Expression or an ExpressionStatement. Expressions can be of arbitrary complexity, of course, potentially containing several nested subexpressions. Because the expression may well have been created using AST literals instead of BooCodeBuilder, it runs a CodeReifier over it to ensure everything's been processed correctly so far.

CodeReifier is a bit of a strange beast. It runs a visitor over the passed in node tree, and for every individual node that's a Statement, a TypeReference or an Expression, (which is pretty much everything,) it runs over every step in the pipeline that's gone so far, checks if it supports reification, and if so tells it to reify that node, which usually means running its visitor over that node.

The exception, though, is the MacroAndAttributeExpansion step. Its reifier method looks like this:

        public Expression Reify(Expression node)
        {
            ApplyAttributesAndExpandMacros();
            return node;
        }

Observe the notable absence of actually doing anything with the node that was passed in, such as passing it to the method that does the work. With nothing to limit it, the ApplyAttributesAndExpandMacros method processes the entire codebase. For every Expression, Statement, and TypeReference in the tree being reified.

There are two problems here. First, a meta method can't return a MacroStatement. (Although it could return a BlockExpression that has a MacroStatement somewhere inside. Or, for that matter, it could be abused to insert a macro invocation (or an AST attribute) anywhere in the codebase, as it has access to the AST to execute arbitrary code upon it.)

Second, if a meta method invocation did create a MacroStatement or an AST attribute, the reifier only runs one ApplyAttributesAndExpandMacros pass, rather than running in a loop to make sure you don't end up with macros that expand to other macros. (If we're going to be paranoid about this, we may as well go all the way, right?)

Because of this, code that makes heavy use of meta methods (or any other feature that uses reification) finds its compilation slowed quite unreasonably.

masonwheeler commented 8 years ago

One possible solution, though it feels like a bit of a hack, would involve the following changes:

  1. Put a static int field with a read-only property associated with it on the MacroStatement and Attribute classes. Each time either of these classes is instantiated, the constructor will increment the field value by 1.
  2. Put a static int field, for private use only, on the MacroAndAttributeExpansion class. When a Reify method is called on this class, it compares the value of this field to the sum of the counters on MacroStatement and Attribute. If they are the same, it exits without running reification, as no new macros or attributes have been created since the last time we saw it. Otherwise, it sets its internal counter to this sum value and then runs reification.
  3. Make MacroAndAttributeExpansion call RunExpansionIterations rather than ApplyAttributesAndExpandMacros when running reification, just to be safe.
  4. At the end of MacroAndAttributeExpansion.Run(), set the counter value, to avoid an unnecessary reification pass the first time.

It's a hack, but it feels like the simplest change that would actually work. Any comments before I go and implement this?