MPLLang / mpl

The MaPLe compiler for efficient and scalable parallel functional programming
Other
306 stars 18 forks source link

Opportunity for performance improvement in CC #148

Closed shwestrick closed 2 years ago

shwestrick commented 2 years ago

See https://github.com/MPLLang/mpl/issues/147#issuecomment-1061287913

shwestrick commented 2 years ago

Essentially, the CC work list has a (typically small) space penalty because it does not exactly match DFS space behavior. The work list is a collection of objptrs that need to be traced, but this means that for large objects, we pay O(# fields) space to enqueue all of its fields.

Suggested fix: allow for CC work list entries of the form (objptr, field offset) so that we can complete a full DFS on one field of a large object while only paying O(1) space to remember how much of the object remains to be traced.

shwestrick commented 2 years ago

Another small optimization we could do is ensure that the mark/unmark loop happens as frequently as possible. So rather than enqueuing the full root set, we could enqueue one root, do the tracing loop, then enqueue the next root, etc.

For example, in the code, we could add markLoop(...) to the end of forceForward, and unmarkLoop(...) to the end of forceUnmark

shwestrick commented 2 years ago

This is implemented by #150 💪