jckarter / clay

The Clay programming language
http://claylabs.com/clay

Use BumpPtrAllocator for Object allocation #365

Closed ghost closed 12 years ago

ghost commented 12 years ago

Significantly reduces number of allocations during parsing and compilation.

This is a pretty simple implementation designed to minimise source changes, but it seems to work OK. It may address issue #123.

jckarter commented 12 years ago

Is there any compile-time performance improvement with this patch? How does memory usage compare?

ghost commented 12 years ago

Significant parse/compile time improvements when compiling individual projects. Test suite time dropped ~20 secs (~8%). Allocations seem to be halved at worst for small modules. The default slab size is fairly small (4096 bytes), so memory usage is similar. Larger slabs made a negligible difference, as most allocations are around 100 bytes.
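For readers unfamiliar with the technique, here is a minimal sketch of a bump-pointer allocator with 4096-byte slabs. It is not LLVM's `BumpPtrAllocator` and not the code in this patch; the `BumpArena` class and its members are hypothetical names made up for illustration. It shows why roughly 100-byte objects cost one `malloc` per slab rather than one per object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical minimal bump-pointer arena (illustration only).
class BumpArena {
public:
    explicit BumpArena(std::size_t slabSize = 4096) : slabSize_(slabSize) {}
    BumpArena(const BumpArena &) = delete;
    BumpArena &operator=(const BumpArena &) = delete;

    // Memory is only returned to the system when the arena itself dies.
    ~BumpArena() {
        for (void *slab : slabs_) std::free(slab);
    }

    // Returns `size` bytes aligned to `align`, or nullptr if malloc fails.
    void *allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0 &&
               "alignment must be a power of two");
        std::uintptr_t p = alignUp(cur_, align);
        if (slabs_.empty() || p + size > end_) {
            // Current slab is exhausted: start a new one. Oversized
            // requests get a slab big enough to hold them.
            std::size_t bytes =
                size + align > slabSize_ ? size + align : slabSize_;
            void *slab = std::malloc(bytes);
            if (!slab) return nullptr;
            slabs_.push_back(slab);
            cur_ = reinterpret_cast<std::uintptr_t>(slab);
            end_ = cur_ + bytes;
            p = alignUp(cur_, align);
        }
        cur_ = p + size;  // "bump" the pointer past the new object
        return reinterpret_cast<void *>(p);
    }

    std::size_t slabCount() const { return slabs_.size(); }

private:
    static std::uintptr_t alignUp(std::uintptr_t v, std::size_t align) {
        return (v + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
    }

    std::size_t slabSize_;
    std::vector<void *> slabs_;
    std::uintptr_t cur_ = 0;
    std::uintptr_t end_ = 0;
};
```

With these assumptions, a few dozen 100-byte requests share one slab, so the number of calls into `malloc` drops sharply while total memory stays close to what the objects themselves need.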

Be interesting to see how well this works on other platforms.

jckarter commented 12 years ago

Well, BumpPtrAllocator never actually releases memory, so I think compile-time-computation-heavy code that allocates lots of temporaries will use more memory or run out of memory. For instance, one of the cocoa.* tests on OS X tops out at 1.89GB with your patch. While it's sensible to leak AST nodes, invoke tables, overloads, and other objects that are required for the lifetime of a compile job, I think there needs to be a smarter allocation strategy for EValues and CValues used for evaluation and codegen. Maybe a stack of BumpPtrAllocators mirroring the callstack could be made to work, though there will probably be problems with passing objects up the stack.
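A rough sketch of the "allocator stack mirroring the callstack" idea, under the simplifying assumption of a single fixed-size buffer whose bump offset is rewound when a scope exits. `StackArena`, `Scope`, and `sumSquares` are hypothetical names for illustration, not anything in Clay or LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-capacity arena whose bump offset can be rewound.
class StackArena {
public:
    explicit StackArena(std::size_t capacity) : buf_(capacity), used_(0) {}

    // Returns nullptr when the buffer is full. Offsets are rounded up to
    // max_align_t, assuming the buffer start itself is suitably aligned.
    void *allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        std::size_t start = (used_ + align - 1) / align * align;
        if (start > buf_.size() || size > buf_.size() - start) return nullptr;
        used_ = start + size;
        return buf_.data() + start;
    }

    std::size_t used() const { return used_; }

    // RAII marker: remembers the offset on entry and rewinds to it on
    // exit, releasing everything allocated inside the scope at once.
    class Scope {
    public:
        explicit Scope(StackArena &arena)
            : arena_(arena), mark_(arena.used_) {}
        ~Scope() { arena_.used_ = mark_; }
        Scope(const Scope &) = delete;
        Scope &operator=(const Scope &) = delete;

    private:
        StackArena &arena_;
        std::size_t mark_;
    };

private:
    std::vector<unsigned char> buf_;
    std::size_t used_;
};

// Hypothetical evaluation step: its temporaries live only for this call.
// Returns -1 if the arena is out of space.
int sumSquares(StackArena &arena, std::size_t n) {
    StackArena::Scope scope(arena);
    int *tmp = static_cast<int *>(arena.allocate(sizeof(int) * n));
    if (!tmp) return -1;
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        tmp[i] = static_cast<int>(i * i);
        total += tmp[i];
    }
    // Returning by value is safe. Returning `tmp` would not be: the
    // Scope destructor rewinds the arena, so the pointer would dangle.
    return total;
}
```

The comment at the end of `sumSquares` is the "passing objects up the stack" problem in miniature: anything that must outlive the callee has to be copied out or allocated in the caller's region before the scope is entered.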

jckarter commented 12 years ago

OTOH, the reference counting for BumpPtr allocated objects is unnecessary (since the object memory is never reclaimed anyway), and removing it would probably give even better performance improvements.

ghost commented 12 years ago

Joe, can you try your OS X test with the bump alloc restricted to ANodes? It still gives a significant drop in allocations and increased performance.
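One way to restrict bump allocation to a single class is a class-scoped `operator new`. The sketch below is only an assumption about how that could look; `NodePool`, `Object`, `AstNode`, and `Value` are hypothetical stand-ins, not Clay's actual classes, and the pool is a fixed static buffer to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical pool backing only the AST node class below.
struct NodePool {
    // Number of allocations served by the pool so far.
    static std::size_t &count() {
        static std::size_t n = 0;
        return n;
    }

    // Bump-allocates from a static buffer; nullptr when it is full.
    static void *take(std::size_t size) {
        alignas(std::max_align_t) static unsigned char buffer[1 << 16];
        static std::size_t used = 0;
        const std::size_t align = alignof(std::max_align_t);
        std::size_t start = (used + align - 1) / align * align;
        if (size > sizeof(buffer) - start) return nullptr;
        used = start + size;
        ++count();
        return buffer + start;
    }
};

struct Object {
    virtual ~Object() {}
};

// Only this class opts in to bump allocation.
struct AstNode : Object {
    int line;
    explicit AstNode(int l) : line(l) {}

    static void *operator new(std::size_t size) {
        void *p = NodePool::take(size);
        if (!p) throw std::bad_alloc();
        return p;
    }

    // Deliberate no-op: pool memory is never handed back individually.
    static void operator delete(void *) noexcept {}
};

// Everything else keeps using the global new/delete.
struct Value : Object {
    int v;
    explicit Value(int x) : v(x) {}
};
```

Here `new AstNode(...)` goes through the pool while `new Value(...)` still hits the general-purpose heap, so short-lived values can be freed normally.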

ghost commented 12 years ago

I did try this with the ref counting disabled and it didn't have much impact. Using the bump for just ANodes makes this a moot point anyway as the other Objects need it.

ghost commented 12 years ago

Oh, the joy of learning C++! OK, I'll see if I can make the code look a little uglier and less intuitive, as the designers obviously intended.

jckarter commented 12 years ago

Running the tests with only ANodes bump allocated works with about the same memory footprint as before, although it's a bit slower: bump-allocating everything takes 713s, while bump-allocating only ANodes takes 801s. (That's with LLVM compiled with Debug+Asserts.)

jckarter commented 12 years ago

And of course, someone on a Mac that doesn't have 8GB of RAM would get a much slower time for the former case.

ghost commented 12 years ago

Did you see any increase in performance with this? Do the cocoa tests still require huge amounts of memory?

jckarter commented 12 years ago

Memory usage is the same as before. There's a small speed up running the tests.