lihaoyi / macropy

Macros in Python: quasiquotes, case classes, LINQ and more!

Improve macro load performance #43

Closed by finiteloop 11 years ago

finiteloop commented 11 years ago

We have a large code base, and macropy caused the import time of our project to rise to about 3s on most people's laptops. These changes, which reduce AST walking, cut it to about 1.6s.

lihaoyi commented 11 years ago

Looks good; thanks for the patch =) We didn't think of any perf improvements, since we don't have any large code base and have no real idea how much it's slowing things down. How does 1.6s compare to the un-macroed import time? Is it a big deal?

finiteloop commented 11 years ago

It is a reasonably big deal for us. Lots of excitement about the macros we are using, but tempered by a noticeable increase in the start time of people's scripts and servers, which slows down development. We will look at a few more optimizations when we have time and will send pull requests if any are impactful.

lihaoyi commented 11 years ago

Yeah, I'd expect that walking over the AST of a large Python program will take a good long while. CPython isn't particularly fast, and PyPy doesn't seem to be any better from a "how long does our test suite take" perspective.

One potential approach would be to perform aggressive if .... not in sub_string culling, similar to what your patch does at the file level, using the offsets of each AST node to bound the searched sub_string. It would be a conservative heuristic (at least without proper source maps), but it should still be enough to quickly cull large macro-less sections of the Python AST and avoid recursively walking them.
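A minimal sketch of that node-level culling, to make the idea concrete. It is not macropy code: candidate_nodes and my_macro are hypothetical names, and it assumes Python 3.8+ so that nodes carry end_lineno. It bounds each node by whole lines rather than exact column offsets, which only makes the check more conservative.

```python
import ast


def candidate_nodes(source, macro_names):
    """Yield only AST nodes whose source lines mention a macro name.

    Hypothetical helper, not part of macropy. A subtree is skipped
    entirely when none of the names occur in the lines it spans.
    """
    # Split on "\n" so indices line up with the line numbers ast reports.
    lines = source.split("\n")
    tree = ast.parse(source)

    def mentions_macro(node):
        start = getattr(node, "lineno", None)
        end = getattr(node, "end_lineno", None)
        if start is None or end is None:
            # No position info (operators, contexts, ...): stay conservative.
            return True
        # On Python 3.8+ a def/class node starts at its own line, so
        # widen the range to include any decorator lines above it.
        for decorator in getattr(node, "decorator_list", []):
            start = min(start, decorator.lineno)
        text = "\n".join(lines[start - 1:end])
        return any(name in text for name in macro_names)

    def visit(node):
        if not mentions_macro(node):
            return  # cull the whole subtree without walking it
        yield node
        for child in ast.iter_child_nodes(node):
            yield from visit(child)

    for statement in tree.body:
        yield from visit(statement)
```

The substring test can produce false positives (a name inside a string or comment), which only costs an unnecessary walk. It should not miss a real use as long as macro_names holds the names as they are actually bound in the file, including any import aliases.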

It would be great if you could provide a sample "this takes too long!" benchmark, then I (and others) would be able to attack the performance issue too.
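As a rough starting point for such a benchmark, something like the following could work. It is only a sketch: synthetic_module and time_parse_and_walk are hypothetical names, the generated code is macro-free, and ast.walk is a stand-in for macropy's real expansion pass, so the numbers show only the cost of a plain full-tree walk.

```python
import ast
import time


def synthetic_module(n_functions):
    """Build source text for a module of many small, macro-free functions."""
    return "\n".join(
        f"def f{i}(x):\n    return [x + {i} for _ in range(3)]\n"
        for i in range(n_functions)
    )


def time_parse_and_walk(source):
    """Return (seconds to parse, seconds to walk, number of nodes visited)."""
    t0 = time.perf_counter()
    tree = ast.parse(source)
    t1 = time.perf_counter()
    # Visit every node once, as a naive expansion pass would have to.
    count = sum(1 for _ in ast.walk(tree))
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1, count


if __name__ == "__main__":
    parse_s, walk_s, nodes = time_parse_and_walk(synthetic_module(20000))
    print(f"parse: {parse_s:.3f}s  walk: {walk_s:.3f}s  nodes: {nodes}")
```

Swapping the ast.walk line for the actual macro-expansion entry point, and the synthetic source for a real module, would give a figure closer to what users see at import time.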