WebAssembly / interface-types


Finalization and linear allocation #7

Closed magcius closed 5 years ago

magcius commented 6 years ago

The Overview document says that there is no provision for finalizing or freeing a table slot, but proposes that an index to a pending slot could be used.

That, of course, won't work for more complex allocation patterns (e.g. I retrieve three DOM elements, free the first and third ones and keep the second around in a global -- the third one gets the pending slot, and we can't free the first one). This means that such an object will be leaked forever, with no ability to unroot the object in it.

Proposal: If the copy_elem proposed in issue #4 is implemented, one could finalize a slot by copying an uninitialized null slot over it. The slot would still not be able to be reused with a simple "pending slot" approach, but at least the object doesn't leak forever.

For a more robust solution, the WASM code should track free slots at runtime. For that, a companion swap_elem opcode to sort the table into short-lived "stack" slots and long-lived "heap" slots would be a bonus, but one could obviously do it the long way with three copy_elems.
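A minimal sketch of what "track free slots at runtime" could look like, modelling the table as a plain array of nullable references. The `SlotTable` class and its method names are hypothetical and not part of any proposal; in real wasm code the free list would live in linear memory and the clearing would use whatever null-out op the proposal ends up providing.

```typescript
// Hypothetical model: the table is a plain array of nullable references.
type Ref = object | null;

class SlotTable {
  private slots: Ref[] = [];
  private free: number[] = []; // indices of cleared slots, reused last-in first-out

  // Store a reference and return its slot index, reusing a freed slot if any.
  alloc(value: object): number {
    const reused = this.free.pop();
    if (reused !== undefined) {
      this.slots[reused] = value;
      return reused;
    }
    this.slots.push(value);
    return this.slots.length - 1;
  }

  // Clear a slot so the object is no longer rooted, and remember the index.
  release(index: number): void {
    if (index < 0 || index >= this.slots.length || this.slots[index] === null) {
      throw new RangeError(`slot ${index} is not allocated`);
    }
    this.slots[index] = null;
    this.free.push(index);
  }

  get(index: number): Ref {
    return this.slots[index] ?? null;
  }

  get size(): number {
    return this.slots.length;
  }
}

// The scenario above: take three references, free the first and third, keep the second.
const slotTable = new SlotTable();
const first = slotTable.alloc({ name: "first" });
const second = slotTable.alloc({ name: "second" });
const third = slotTable.alloc({ name: "third" });
slotTable.release(first);
slotTable.release(third);
const fourth = slotTable.alloc({ name: "fourth" }); // reuses the third slot
const fifth = slotTable.alloc({ name: "fifth" });   // reuses the first slot
```

Both freed slots get reused and the table never grows past three entries, which a single "pending slot" index cannot achieve for this pattern.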

The expectation so far seems to be that slots are stack-allocated and short-lived, but I don't see that being the case in real-world code. This can also be seen in the NEXT_SLOT global approach to export binding object allocation. Perhaps in the future, other approaches could be explored for allocation strategy, e.g. formalizing separate "stack" and "heap" spaces. One cheap, common idea is to reserve the lower end of the table space for long-lived "heap" objects, and to use a NEXT_SLOT global that counts down from the end of the table for short-lived "stack" objects.
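As a rough illustration of that split layout, here is a sketch with a fixed-capacity table where "heap" slots grow up from index 0 and a NEXT_SLOT-style counter grows down from the end. `TwoEndedTable`, `pushStack`, `popStack`, and `promote` are made-up names, and the heap side is a simple bump allocator with no free list, purely to keep the example short.

```typescript
class TwoEndedTable {
  private readonly slots: (object | null)[];
  private heapTop = 0;      // first index not yet used by the "heap" region
  private nextSlot: number; // NEXT_SLOT: counts down from the end of the table

  constructor(capacity: number) {
    this.slots = new Array<object | null>(capacity).fill(null);
    this.nextSlot = capacity;
  }

  // Place a short-lived reference in the "stack" region.
  pushStack(value: object): number {
    if (this.nextSlot <= this.heapTop) throw new RangeError("table full");
    this.nextSlot -= 1;
    this.slots[this.nextSlot] = value;
    return this.nextSlot;
  }

  // Remove the most recent "stack" reference and unroot it.
  popStack(): object | null {
    if (this.nextSlot >= this.slots.length) throw new RangeError("stack empty");
    const value = this.slots[this.nextSlot] ?? null;
    this.slots[this.nextSlot] = null;
    this.nextSlot += 1;
    return value;
  }

  // Copy a live "stack" reference into the long-lived "heap" region.
  promote(stackIndex: number): number {
    const value = this.slots[stackIndex] ?? null;
    if (stackIndex < this.nextSlot || stackIndex >= this.slots.length || value === null) {
      throw new RangeError(`slot ${stackIndex} is not a live stack slot`);
    }
    if (this.heapTop >= this.nextSlot) throw new RangeError("table full");
    const heapIndex = this.heapTop;
    this.slots[heapIndex] = value;
    this.heapTop += 1;
    return heapIndex;
  }

  get(index: number): object | null {
    return this.slots[index] ?? null;
  }
}

const twoEnded = new TwoEndedTable(8);
const elemX = { id: "x" };
const elemY = { id: "y" };
const stackX = twoEnded.pushStack(elemX); // slot 7
const stackY = twoEnded.pushStack(elemY); // slot 6
const heapY = twoEnded.promote(stackY);   // copied to slot 0
const poppedY = twoEnded.popStack();
const poppedX = twoEnded.popStack();
```

After both pops the stack region is empty again, while the promoted reference stays rooted in the heap region.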

lukewagner commented 6 years ago

So the way to think about these tables is that they are just "typier" JS arrays that are held by the wasm Instance object. Thus, when an object reference is passed in and goes into the Table (read: JS array), it is naturally kept alive by the Instance until either the Instance itself is unreachable or the element is cleared. So I think the answer to your basic question is that yes, in addition to copy_elem, we've talked about a set_elem_null op (maybe to be complemented by an is_elem_null op, as you've pointed out in #5).

(@flagxor It'd be good to add a new section to the proposal that mentions these ops.)

But given this model, I don't really see a route for adding finalization or automatic management of the slots by the engine. I do think wasm+JS/web binding does really need finalization or some form of postmortem notification, but I think that has to be a separate feature from host bindings (that may perhaps integrate with host bindings?).

magcius commented 6 years ago

I would like to see some thought put towards allocation strategy. The pending_slot stuff clearly isn't robust since it can leak slots forever. I expect that compilers will implement a "heap" object table which is expensively tracked and a "stack" object table where you simply store the height of the stack. Anything function-local goes into the "stack" table, anything that's global needs to be copied to the "heap" table. For bonus points, a set_elems_null which takes offs/len would be incredibly helpful for stack cleanup.

That said, this can wait until after prototype. A set_elems_null can easily be done in wasm.
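For illustration, the ranged version is just a loop over the single-slot op. In this sketch the table is again modelled as an array and `setElemsNull` is a hypothetical helper; the assignment inside the loop stands in for the set_elem_null op discussed above.

```typescript
// Clear `len` consecutive slots starting at `offs`, e.g. a whole stack frame.
function setElemsNull(table: (object | null)[], offs: number, len: number): void {
  if (
    !Number.isInteger(offs) || !Number.isInteger(len) ||
    offs < 0 || len < 0 || offs + len > table.length
  ) {
    throw new RangeError("range is out of bounds");
  }
  for (let i = offs; i < offs + len; i++) {
    table[i] = null; // stands in for one set_elem_null per slot
  }
}

const frameTable: (object | null)[] = [{ n: 0 }, { n: 1 }, { n: 2 }, { n: 3 }];
setElemsNull(frameTable, 1, 2); // clears slots 1 and 2 only
```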

lukewagner commented 6 years ago

Yeah, I agree that, for the specific case of returning a reference from a table from wasm to JS, we should be able to provide a better mechanism for the stack-y use case where you want to release your table reference immediately after reading it for return. One obvious way is to have some return binding named "pop" that (1) reads the elem, (2) nulls it, and (3) decrements the index global variable.
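A sketch of those three steps, assuming the index global points at the most recently stored element and that the table is modelled as an array. `StackState` and `popBinding` are illustrative names, not part of the proposal.

```typescript
interface StackState {
  table: (object | null)[];
  top: number; // the index global: slot of the most recent element, -1 when empty
}

function popBinding(state: StackState): object | null {
  if (state.top < 0) throw new RangeError("nothing to pop");
  const value = state.table[state.top] ?? null; // (1) read the elem
  state.table[state.top] = null;                // (2) null it
  state.top -= 1;                               // (3) decrement the index global
  return value;
}

const nodeA = { tag: "a" };
const nodeB = { tag: "b" };
const popState: StackState = { table: [nodeA, nodeB], top: 1 };
const returned = popBinding(popState); // hands nodeB back and unroots it
```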

magcius commented 6 years ago

I was imagining this in the context of host-to-wasm calls, or local variables in wasm itself. In some cases, objects passed from host-to-wasm will likely be cleared after the function returns. I was imagining that it would be allocated in "stack storage", and if wasm wants to hold onto it, it would need to copy it to its heap table. The wasm engine itself could have "stack semantics" for certain tables which checks at runtime that there are no holes.

While this is a little different, since it's more related to refcounts, gobject-introspection has transfer annotations where you can specify (transfer none) and (transfer full). "transfer none" means that the ownership is not transferred to the receiver, and if you want to retain it, you need to dupe/refcount it (for instance, static string memory, or an object that shouldn't be freed), and "transfer full" means that it is transferred to the receiver, and that they need to unref/free it on their side.

Perhaps some inspiration can be taken here -- for host-to-wasm cases, "transfer none" arguments would mean that the table slots are automatically set to null upon function return, and "transfer full" arguments would mean that the table slots retain their value. Similar semantics can be imagined for STRING and JSON as well.
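To make the idea concrete, here is a sketch of a host-side call wrapper under those semantics. Everything in it is hypothetical: the `Transfer` type, the `callWithTransfers` helper, and the convention that the callee receives slot indices. It only shows the intended lifetime rule, not how a binding would actually be specified.

```typescript
type Transfer = "none" | "full";

interface BoundArg {
  value: object;
  transfer: Transfer;
}

// Place each argument in a fresh slot, call the wasm-side function with the
// slot indices, then null every "transfer none" slot once the call returns.
function callWithTransfers<R>(
  table: (object | null)[],
  args: BoundArg[],
  callee: (slots: number[]) => R,
): R {
  const slots: number[] = [];
  for (const arg of args) {
    table.push(arg.value);
    slots.push(table.length - 1);
  }
  try {
    return callee(slots);
  } finally {
    args.forEach((arg, i) => {
      const slot = slots[i];
      if (arg.transfer === "none" && slot !== undefined) {
        table[slot] = null; // borrowed: unrooted on return
      }
    });
  }
}

const hostTable: (object | null)[] = [];
const borrowed = { kind: "borrowed" };
const owned = { kind: "owned" };
const argCount = callWithTransfers(
  hostTable,
  [
    { value: borrowed, transfer: "none" },
    { value: owned, transfer: "full" },
  ],
  (slots) => slots.length,
);
```

A callee that wants to keep a "transfer none" argument would have to copy it to another slot before returning, which matches the dupe/refcount requirement described above.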

lukewagner commented 6 years ago

I expect you're right that a common pattern will be "push the incoming value, pop on return". With the special "pop" mentioned in my previous comment added, I think we'd have the necessary primitives to implement that, although if there's a "pop", maybe a "push" would be convenient and symmetric. I think the important thing is that it's fully deterministic where the element goes into the Table, and having "push"/"pop" bindings refer to a global variable achieves that.

pchickey commented 5 years ago

Closing as out-of-date: these concepts don't map to the current proposal, which has evolved a lot since this issue was opened.