coder-mike commented 1 year ago

Idea: can closures get even smaller?

At the moment, a minimal closure and its enclosed environment take essentially 10 bytes of base memory (+2 bytes for each user-defined variable):

2021-08-15-closures-memory-layout

The closure itself, with a 2B allocation header, 2B scope pointer (pointer to the enclosed environment), and 2B pointer to the corresponding bytecode.
The environment, with a 2B allocation header and 2B pointer to any outer environment (+ 2 bytes for each user-defined variable)

I described this in this blog post.

I thought this was basically minimal. Multiple closures can share the same scope (environment) so it would seem that they need to be distinct allocations. And a closure at the very minimum needs to be a tuple of a function and its environment, right?

A better design?

But while I've been brainstorming the design of async-await, I've thought of a potentially-smaller solution for some closures.

2022-09-18 revised-closures

This design brings the size down from 10 bytes to 4 bytes (plus any user-defined variables). In the example in the image, with a single user-defined variable, this brings the size down from 12 bytes to 6 bytes -- half the size.

It does so by:

Unifying the closure with its environment, so it becomes a single allocation. This saves 4 bytes.
Removing the pointer to the outer environment if it's not needed (which I suspect is a common case), which saves another 2 bytes.

With this design, when you execute a closure, the engine would look at the first slot in the closure to identify the function to call, and treat the whole allocation as its environment.

The unification of a closure with its environment only makes sense for one closure in an environment. E.g. we could just pick the first closure and inline it into its environment. So the first closure in a certain scope will take 4 bytes (because it's inlined) and the second closure will take an additional 6 bytes because it needs the function pointer plus a reference back to the environment (which also happens to be the first closure).

Also, the elimination of the parent pointer slot only makes sense if there is no outer scope. The global scope and top-level module scope are not considered to be closure scopes, so I think this may be true a lot of the time. When it's not true, one extra slot will be needed.

An async-ready design

This design fits nicely into the plan for async/await because the unification of a closure with its environment means that a closure can easily modify itself using normal variable-accessing instructions (in the earlier diagram, x is a variable, but function is essentially like a variable as well, from the perspective of the bytecode). This allows us to have closures that modify their own function pointer over time (at the bytecode level, not as a user-level capability). So an async function could then be implemented as essentially a closure that re-targets its function pointer to the current await point.

Personal reflection: should I do it now?

Closures currently work as they are, and I don't want to spend endless cycles micro-optimizing. But I'm really torn on this one because going from 10 bytes to 4 bytes is quite a big jump. A lot of functional-style code could really benefit from this. But then there are still so many things I want to add to Microvium and not sure this is the most important.

coder-mike commented 1 year ago

For anyone following this ticket, I've written up a more detailed description of the design here

coder-mike commented 1 year ago

This is done and merged! It will go out in the next release.

coder-mike / microvium