`DOMChangeList` and `DOMTreeConstruction`

Motivation

There are three high-level motivations for this API:

1. To make it clearer and less error prone to apply a sequence of DOM operations at once, that can support the full gamut of mutations to Elements and Nodes.
1. For efficiency, to provide an API for constructing a sequence of DOM operations that can be:
    1. constructed in a worker and transferred to the UI thread
    1. constructed with a minimum of allocations
    1. applied in one shot without interleaved user code
1. To support, in one API, the superset of trees that can be produced using the HTML parser and using the DOM API (the HTML parser supports more tag names, while the DOM API supports more trees, such as custom elements nested inside a table).

This is an initial, minimal version of this API, which could be expanded over time with more capabilities. It should always maintain its goals of predictable, allocation-lite performance, focusing on giving user-space abstractions the capabilities they need for maximum performance.

Concepts

NodeToken

A NodeToken is an opaque value that represents a Node. It can be efficiently transferred from one worker to another.

It serves two purposes:

To represent a Node that already exists in another worker, and can serve as the starting point of a series of mutations expressed in a DOMChangeList.
To represent an intermediate Node that was produced as part of the process of building a DOMChangeList.

Applying a Change

The intent of this API is to have these performance characteristics:

Creating a change list (DOMTreeConstruction or DOMChangeList) significantly reduces GC-managed allocations compared to the current DOM APIs.
Applying a change list should not create intermediate JavaScript wrappers for the DOM objects it creates, which should reduce costs.
It is possible to run the JavaScript code necessary to construct a change list in a worker, and apply it on the UI thread with minimal additional work or allocations in JavaScript.
The API creates a single immutable blob of instructions to pass to the engine, which is intended to avoid JavaScript side effects while the application proceeds. (as Boris Zbarsky said in his comment about proposals in this space, "that would simplify both specification and implementation (in the sense of not having to spell out a specific processing algorithm, allowing parallelized implementation, etc).")
It is reasonable to assume that the builder APIs could be exposed to WebAssembly.

If there is some reason that a concrete implementation of this design might not be able to accomplish these goals, it's almost certainly something we should discuss.
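One way to picture the "single immutable blob of instructions" is a flat array of integers plus a string table. The sketch below is only an illustration: the opcode numbers, the `encodeOps` helper, and the layout are assumptions made for this example, and nothing in the proposal defines an instruction encoding.

```javascript
// Hypothetical opcodes; the proposal does not define an instruction encoding.
const OP = { OPEN_ELEMENT: 1, SET_ATTRIBUTE: 2, APPEND_TEXT: 3, CLOSE_ELEMENT: 4 };

// Encode a list of [opcode, ...stringArgs] as a Uint32Array plus a string table.
// Each instruction becomes: opcode, argument count, then string-table indices.
function encodeOps(ops) {
  const strings = [];
  const index = new Map();
  const intern = (s) => {
    if (!index.has(s)) {
      index.set(s, strings.length);
      strings.push(s);
    }
    return index.get(s);
  };
  const words = [];
  for (const [opcode, ...args] of ops) {
    words.push(opcode, args.length, ...args.map(intern));
  }
  return { words: Uint32Array.from(words), strings };
}

const blob = encodeOps([
  [OP.OPEN_ELEMENT, "div"],
  [OP.SET_ATTRIBUTE, "class", "note"],
  [OP.APPEND_TEXT, "Hello"],
  [OP.CLOSE_ELEMENT],
]);
```

The `blob.words.buffer` value is an `ArrayBuffer`, so it could be listed as a transferable when posting the blob from a worker to the UI thread.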

Details

DOMTreeConstruction

Note: Use of TypeScript generics notation in the details of this proposal does not intend to imply any additional type checking. It is simply a way to annotate, for clarity, which kind of node a particular node token is intended to produce. This proposal expects checking at the point of application (e.g. appending a node to a non-element node doesn't make sense and would be an error at the time of application). At present, this proposal does not assume that errors in the change list cause previously applied operations to be rolled back; that is listed as an open question below.

// you can create a NodeToken in the UI thread and transfer it into
// a worker that constructs the DOMChangeList or DOMTreeConstruction.
partial interface Node {
  getToken(): NodeToken<this>;  
}

interface Bounds {
  firstNode: NodeToken<Node>;
  lastNode: NodeToken<Node>;
}

partial interface Element {
  // this has the same semantics as insertBefore, but takes a `DOMTreeConstruction`
  insertTreeBefore(element: Element, tree: DOMTreeConstruction, reference: Node): Promise<AppliedChanges>;
}

// The intent of this object is to allow engines to clean up any bookkeeping maps they
// need to track `NodeToken`s once the `AppliedChanges` is GCed.
interface AppliedChanges {
  // this allows code that created a ChangeList to get the actual
  // node they were creating so they can add listeners or change
  // the node later.
  getNode<T extends Node>(t: NodeToken<T>): T;
}

// DOMTreeConstruction is transferrable, and is backed by raw bytes. This
// allows trees to be constructed in workers.
interface DOMTreeConstruction {
  openElement(name: string, ns: string = "html"): NodeToken<Element>;
  closeElement(): void;

  setAttribute(name: string, value: string = "", ns: string = ""): void;
  appendText(text: string): NodeToken<Text>;
  appendComment(text: string): NodeToken<Comment>;
  appendHTML(html: string): Bounds;
}
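
To make the builder-style interface above more concrete, here is a minimal sketch of how tree construction might be recorded. `TreeConstructionSketch` is a hypothetical stand-in written for this example, not a browser API; it uses plain integers as tokens and simply logs operations, with no validation (the proposal expects checking at application time).

```javascript
// Hypothetical stand-in for DOMTreeConstruction: records operations and
// hands out integer tokens. Not a real browser API.
class TreeConstructionSketch {
  constructor() {
    this.ops = [];
    this.nextToken = 0;
  }
  openElement(name, ns = "html") {
    const token = this.nextToken++;
    this.ops.push({ op: "openElement", name, ns, token });
    return token;
  }
  closeElement() {
    this.ops.push({ op: "closeElement" });
  }
  setAttribute(name, value = "", ns = "") {
    // Applies to the most recently opened element, as in the interface above.
    this.ops.push({ op: "setAttribute", name, value, ns });
  }
  appendText(text) {
    const token = this.nextToken++;
    this.ops.push({ op: "appendText", text, token });
    return token;
  }
}

// Builds the equivalent of <article class="post"><h1>Hello world</h1></article>.
const tree = new TreeConstructionSketch();
const articleToken = tree.openElement("article");
tree.setAttribute("class", "post");
const titleToken = tree.openElement("h1");
const titleTextToken = tree.appendText("Hello world");
tree.closeElement(); // closes h1
tree.closeElement(); // closes article
```

No node objects exist at this point; the tokens only become nodes after the recorded operations are applied.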

DOMChangeList

partial interface Document {
  applyChanges(changes: DOMChangeList): Promise<AppliedChanges>;
}

// DOMChangeList is transferrable, and is backed by raw bytes. This
// allows change lists to be constructed in workers.
interface DOMChangeList {
  /// traverse
  nextSibling(n: NodeToken<Node>): NodeToken<Node | null>;
  previousSibling(n: NodeToken<Node>): NodeToken<Node | null>;
  firstChild(n: NodeToken<Node>): NodeToken<Node | null>;
  lastChild(n: NodeToken<Node>): NodeToken<Node | null>;
  parentNode(n: NodeToken<Node>): NodeToken<Element | null>;

  /// create
  createElement(name: string, ns: string = 'html'): NodeToken<Element>;
  createText(text: string): NodeToken<Text>;
  createComment(text: string): NodeToken<Comment>;

  /// update
  setAttribute(el: NodeToken<Element>, name: string, value: string = '', ns: string = ''): void;
  updateTextData(n: NodeToken<Text>, text: string): void;
  updateCommentData(n: NodeToken<Comment>, text: string): void;
  remove(n: NodeToken<Node>): void;

  /// append
  appendChild(el: NodeToken<Element>, node: NodeToken<Node>): void;
  appendTree(el: NodeToken<Element>, tree: DOMTreeConstruction): void;

  // this unifies the various HTML-based APIs in the DOM; the return value is
  // a token corresponding to the start and a token corresponding to the end of
  // the inserted tree which can be reified after application.
  //
  // IMPORTANT NOTE: The HTML is parsed during application, not when
  // it is added to the change list.
  insertHTMLBefore(node: NodeToken<Node>, html: string): [NodeToken, NodeToken];
}
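
The traversal methods above return tokens for nodes that are only looked up when the list is applied. A rough sketch of that deferred resolution, using plain objects instead of DOM nodes, might look like the following. `ChangeListSketch`, `applySketch`, and the token numbering are all assumptions made for illustration.

```javascript
// Hypothetical stand-in for DOMChangeList over plain objects (no real DOM).
class ChangeListSketch {
  constructor() {
    this.ops = [];
    this.nextToken = 1000; // tokens minted by the list itself (assumed numbering)
  }
  firstChild(n) {
    const token = this.nextToken++;
    this.ops.push({ op: "firstChild", of: n, token });
    return token;
  }
  updateTextData(n, text) {
    this.ops.push({ op: "updateTextData", n, text });
  }
  setAttribute(el, name, value = "") {
    this.ops.push({ op: "setAttribute", el, name, value });
  }
}

// Apply recorded operations in one pass; `known` maps existing tokens to nodes.
function applySketch(list, known) {
  const nodes = new Map(known);
  for (const op of list.ops) {
    if (op.op === "firstChild") {
      // Traversal tokens are resolved here, not when they were recorded.
      nodes.set(op.token, nodes.get(op.of).children[0] ?? null);
    } else if (op.op === "updateTextData") {
      nodes.get(op.n).data = op.text;
    } else if (op.op === "setAttribute") {
      nodes.get(op.el).attributes[op.name] = op.value;
    }
  }
  return nodes;
}

const article = { attributes: {}, children: [{ data: "old text" }] };
const changes = new ChangeListSketch();
const textToken = changes.firstChild(1); // 1 stands in for the article's token
changes.updateTextData(textToken, "Hello world");
changes.setAttribute(1, "data-active", "yes");
applySketch(changes, [[1, article]]);
```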

Example

Using the usual DOM API:

let article = document.getElementById('main-article');
let text = article.firstChild;
text.data = "Hello world";
article.setAttribute("data-active", "yes");
article.removeAttribute("data-pending");

Using the ChangeList API:

let articleToken = document.getElementById('main-article').getToken();

let changes = new DOMChangeList();
let textToken = changes.firstChild(articleToken);
changes.updateTextData(textToken, "Hello world");
changes.setAttribute(articleToken, "data-active", "yes");
// removeAttribute is not listed in the DOMChangeList interface above; it is assumed here.
changes.removeAttribute(articleToken, "data-pending");

document.applyChanges(changes);

FAQ

Is this actually faster?

Don't you still have to create all of the DOM nodes anyway? If so, why is it cheaper?
Isn't DOM pretty fast in modern browsers already?

The intent of this API is to create a low-level interface that is as close as possible to the underlying implementations. It attempts to avoid introducing new costs while reducing a number of paper-cuts that exist in today's usage.

This API creates DOM nodes in the engine, but it does not need to create JavaScript wrappers. Experiments with deep cloneNode show that skipping those wrappers provides a performance benefit, but cloneNode() can't satisfy as many use-cases as this API.
It allows the construction of the set of mutations to occur separately from the application (or even in a worker), keeping the sensitive work that limits 60fps to a minimum.
Since applying changes is asynchronous, the full change list can be applied in batches that avoid blocking interaction (especially scroll). If the browser reaches its budget, it can interleave some work to keep the UI interactive and pick up the mutation process afterward. In short, the API should allow browsers to experiment with more scheduling strategies.
It encourages good staging practices, eliminating some of the major causes of layout thrash (see the next section).

Isn't the real issue that people are interleaving DOM manipulation and layout?

That is certainly a major issue, and this API puts developers on the path to success by encouraging them to stage DOM manipulation work separately from APIs that can trigger painting or layout.

Because the API guarantees that no user script can interleave during the application of changes, there is no way to "mess up" and trigger an immediate flush of any deferred work.
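
The batching idea from the FAQ (apply a slice of the change list, let the engine do other work, then resume) can be sketched with a generator. This is an illustration only: `applyInSlices` is a hypothetical name, and the budget here is a count of operations, whereas an engine would more likely use a time budget. No user callback runs between operations inside a slice.

```javascript
// Hypothetical scheduler sketch: apply operations in slices of at most
// `budget` operations, yielding between slices so other work can run.
function* applyInSlices(ops, applyOne, budget) {
  let done = 0;
  while (done < ops.length) {
    const end = Math.min(done + budget, ops.length);
    for (; done < end; done++) applyOne(ops[done]);
    yield done; // an engine could service input or scrolling at this point
  }
}

const applied = [];
const progress = [];
const slices = applyInSlices(["a", "b", "c", "d", "e"], (op) => applied.push(op), 2);
for (const count of slices) {
  progress.push(count); // number of operations applied so far
}
```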

Unresolved Questions

The current state of this API allows failures to occur during the processing of a change list, and does not require engines to roll back earlier changes (rolling back changes like "remove an iframe" may not be trivial to implement). Would engines prefer to roll back changes?
Should we support APIs like ClassList and the style property through this API? It may be difficult to represent these kinds of changes with the operations already proposed (since this API does not allow direct imperative access to the DOM), and a few additional APIs probably wouldn't do damage to the constraints.
Are there other optimizations that could be performed by engines? For example, Luke Wagner suggested that a parameterized version of this API could work like "prepared statements" in SQL, allowing engines to do up-front work to optimize the access and mutation patterns for application. What can we do to make this API more hospitable to hypothetical optimizations like those?
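
As a rough illustration of the "prepared statements" idea, a change list could be described once with named parameters and bound to concrete values each time it is applied. Everything in this sketch (`prepare`, the `param` field, the operation shapes) is hypothetical and not part of the proposal.

```javascript
// Hypothetical "prepared" change list: operations may reference a named
// parameter that is bound at application time.
function prepare(template) {
  // Up-front work an engine might do once; here we only freeze the template.
  const frozen = Object.freeze(template.map((op) => Object.freeze({ ...op })));
  return function bind(params) {
    return frozen.map((op) => ({
      ...op,
      value: op.param !== undefined ? params[op.param] : op.value,
    }));
  };
}

const updateTitle = prepare([
  { op: "updateTextData", target: "titleText", param: "title" },
  { op: "setAttribute", target: "article", name: "data-active", value: "yes" },
]);

// The same prepared list is bound twice with different parameter values.
const first = updateTitle({ title: "Hello world" });
const second = updateTitle({ title: "Goodbye" });
```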

@wycats To be specific, one thing that I'm really interested in, which this proposal does not cover, is event handling and I could really use your input on that if you have any thoughts.

@Gozala what's the easiest way to get in touch?

@wycats I've been noticing you at MozPDX fairly often, maybe next time you're there we can chat? Alternatively IRC, you can find me as @gozala on both freenode and moz.

I looked into the concerns around NodeToken being an integer (or number). For the DOMTreeConstruction idea it can be an integer. The integers get returned as you use DOMTreeConstruction. The moment you turn those integers into nodes using getNode(), there's a reference to DOMTreeConstruction available and any lookup only happens within that object. An implementation could store a map of integers to nodes on DOMTreeConstruction the moment nodes are created as part of insertTreeBefore(). As these integers are therefore entirely self-contained, this even paves the path toward DOMTreeConstruction being serializable and network-transferable. To ensure we don't have to keep that map alive forever we could deterministically empty it by .then()'ing the returned promise from insertTreeBefore(). Those details can be fiddled with, this just shows it's viable.
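
The bookkeeping described above could look roughly like the following sketch. `ScopedTokens` and its methods are hypothetical names; the real `insertTreeBefore()` returns a promise, while this sketch is synchronous to keep it short, with `release()` standing in for the cleanup that would run in `.then()`.

```javascript
// Hypothetical sketch: integer tokens scoped to one construction object,
// resolved through a map that is filled at insertion and emptied afterwards.
class ScopedTokens {
  constructor() {
    this.count = 0;
    this.nodes = new Map(); // token -> node, filled at insertion time
  }
  mint() {
    return this.count++; // tokens are just 0, 1, 2, ... within this object
  }
  insert(makeNode) {
    for (let token = 0; token < this.count; token++) {
      this.nodes.set(token, makeNode(token));
    }
    return this;
  }
  getNode(token) {
    return this.nodes.get(token);
  }
  release() {
    this.nodes.clear(); // drop the map so it does not live forever
  }
}

const tokens = new ScopedTokens();
const a = tokens.mint();
const b = tokens.mint();
tokens.insert((token) => ({ id: token })); // plain objects stand in for nodes
const nodeB = tokens.getNode(b);
tokens.release();
```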

DOMChangeList is trickier, as the integers can also represent nodes in an existing tree (via getToken(), which is listed under DOMTreeConstruction in OP but is not part of it, as no method there takes a NodeToken), in a different thread. That means the integers for DOMChangeList need to be unique within an agent cluster and are probably not suitable for network transfer, ever. It also means they need to be long-lived, as they need to be able to transfer across threads. I would address the forgeability problem by also checking, when looking up the node corresponding to an integer, that its node document is identical to the document upon which applyChanges() was invoked. And as said before, for any instructions that are incorrect we'd abort, leaving you with a potentially broken tree, as we cannot do transactions (see upthread).
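
A minimal sketch of that lookup check, assuming a hypothetical `TokenRegistry` and using plain objects with an `ownerDocument` field in place of real nodes and documents:

```javascript
// Hypothetical registry: a token resolves only if its node belongs to the
// document on which the changes are being applied.
class TokenRegistry {
  constructor() {
    this.next = 1;
    this.nodes = new Map();
  }
  getToken(node) {
    const token = this.next++;
    this.nodes.set(token, node);
    return token;
  }
  resolve(token, targetDocument) {
    const node = this.nodes.get(token);
    if (node === undefined || node.ownerDocument !== targetDocument) {
      throw new Error("unknown or foreign token: " + token);
    }
    return node;
  }
}

const docA = { name: "A" };
const docB = { name: "B" };
const registry = new TokenRegistry();
const token = registry.getToken({ ownerDocument: docA });

let rejected = false;
try {
  registry.resolve(token, docB); // the token's node belongs to docA, not docB
} catch (e) {
  rejected = true;
}
```

The sketch uses a simple counter for token values; it does not address the randomness concern, only the document check.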

(Given the nature of integers in DOMChangeList I suspect there would need to be some requirement for them to be random to avoid fingerprinting.)

So in conclusion I think both designs are viable.

I think it's worth pointing out this experimental library I just put together: https://github.com/AshleyScirra/via.js

This allows full access to all DOM APIs using Proxies. It effectively builds an internal list of DOM commands and posts them to the main thread to be carried out. It happens entirely in JS land. I think its main advantage over DOMChangeList is it allows you to use all APIs the same way you do in the DOM, rather than a limited tree-construction subset with its own API. For example you can do div.style.fontWeight = "bold" from a worker, or make use of mainthread-only APIs like the Web Audio API from a worker. Also the performance is actually kind of reasonable (it's not as terrible as I thought it might be, at least).
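
For readers unfamiliar with the general technique, here is a small generic sketch of recording property writes and method calls through a `Proxy`. It is not Via.js's actual implementation or API; `recorder` and the command shapes are made up for this example.

```javascript
// Generic illustration of building a command list with a Proxy.
// `path` is the chain of property names that led to this placeholder.
function recorder(path, commands) {
  // An arrow function is used as the target so the proxy is callable.
  return new Proxy(() => {}, {
    get(_target, key) {
      return recorder([...path, String(key)], commands);
    },
    set(_target, key, value) {
      commands.push({ type: "set", path: [...path, String(key)], value });
      return true;
    },
    apply(_target, _thisArg, args) {
      commands.push({ type: "call", path, args });
      return undefined;
    },
  });
}

const commands = [];
const div = recorder(["div"], commands);
div.style.fontWeight = "bold";
div.setAttribute("data-active", "yes");
```

After these two statements, `commands` holds one "set" entry and one "call" entry that could be posted to another thread and replayed there against real objects.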

The main problem is that it leaks memory. GC is not observable so it's not possible to identify when the placeholder Proxy objects on the worker get collected and drop their corresponding references on the main thread. I think this can be solved with a special WeakKey object which still doesn't make GC observable: https://discourse.wicg.io/t/a-way-to-the-use-dom-in-a-web-worker/2483/2?u=ashleyscirra

If I read this proposal right, it sounds like this WeakKey idea is like NodeToken but generalised to represent any DOM object at all. This significantly broadens the use cases it covers. For example we want to move our entire HTML5 game engine into a worker, but the lack of APIs like Web Audio is a much bigger problem than whether or not we can create nodes. Via.js lets you do both: build up lists of DOM commands, and access arbitrary APIs only present on the main thread.

Could I get some sort of status update on this concept in general? (Evolutions, meeting notes, etc.?) Haven't really heard or seen anything about it anywhere, either here or on the WICG discourse. (I'm not active on any of the W3C/WHATWG mailing lists, though.)

There's no active mailing list. There was a meeting at a Mozilla All Hands, but that wasn't really conclusive one way or another. The main problem here is lack of implementer interest.

I wrote a blog post that covers some of the options to better support implementing this entirely in a library: https://www.scirra.com/blog/ashley/38/why-javascript-needs-cross-heap-collection

@annevk Sorry, I meant any of the W3C/WHATWG mailing lists. (I edited my comment accordingly.)

Also, I'm not a huge fan of the API proposed - it's a bit too verbose and too much of a shotgun for something that's really just performing transactions on nodes. I'd personally prefer there was a way to just force browsers to batch something without having changes exposed to the main thread until the next animation frame, and a way to run changes in the layout pass to reduce the impact of style recalcs (which account for 90% of the perf issues with DOM frameworks).

Edit: Hidden for easier scrolling

Here's my thought on such an API, one that's a bit less ambitious:

```webidl
// `execute` runs after animation frames, after all previous transactions have been
// executed, but before intersection observers have been run. Any transactions started
// here are deferred to the *next* animation frame.
//
// When `execute` is called, if all of these are true for a node, then an
// `InvalidStateError` should be thrown:
// 1. The node being mutated is in the calling document.
// 2. The node is live, or would have been made live in the transaction.
// 3. UI-visible attributes of that node are read/modified, including ones like
//    `parentNode` and even if indirectly (like in a transaction).
// 4. The node *is not* in `read` (if read) or `write` (if modified).
dictionary LayoutOptions {
    AbortSignal? signal;
    sequence read = [];
    sequence write = [];
    // I know this is invalid syntax, but `this` is supposed to be the layout options object
    Promise? execute();
}

// Transactions assume the state right after the animation frames are run *are* the
// initial state - they don't actually save anything first, and they act as if the
// operations work on the state after animation frame callbacks run.
[Exposed=Window, Constructor]
interface DOMTransaction {
    void insertBefore(ParentNode node, ChildNode? node, ChildNode next);
    void insertAfter(ParentNode node, ChildNode? node, ChildNode prev);
    void replace(ChildNode node, ChildNode? other);
    void setAttribute(Element elem, DOMString key, DOMString? value, DOMString? ns);
    void toggleClass(Element elem, DOMString classes, boolean force);
    void setStyle(HTMLElement elem, DOMString key, DOMString? value);
    void setNodeValue(Node node, DOMString value);

    // The tree is locked between when animation frames and intersection observers
    // would be run. The fulfillment value is that of `callback.execute()` or
    // `undefined` if the method wasn't passed.
    Promise end(optional LayoutOptions? options);
}
```

There's a few specific omissions and design points with my proposal here that I thought I'd note:

1. Most DOM interactions can be reduced to these.
    - Append: `transaction.insertAfter(parent, null, node)`
    - Prepend: `transaction.insertBefore(parent, null, node)`
    - Insert before: `transaction.insertBefore(parent, next, node)`
    - Insert after: `transaction.insertAfter(parent, prev, node)`
    - Replace: `transaction.replace(node, other)`
    - Remove: `transaction.replace(node, null)`
    - Set attribute: `transaction.setAttribute(node, key, value, ns)`
    - Remove attribute: `transaction.setAttribute(node, key, null, ns)`
    - Set class: `transaction.toggleClass(node, classes, true)`
    - Remove class: `transaction.toggleClass(node, classes, false)`
    - Set node value: `transaction.setNodeValue(text, value)`
    - Set style: `transaction.setStyle(elem, key, value)`
    - Remove style: `transaction.setStyle(node, key, null)`
    - Set `id`/`className`/etc.: set that attribute
1. The few that can't could be handled in the `LayoutCallback` without having to spend an extra animation frame. (It also exists for the sake of refs, like what most vdom frameworks have some variant of.)
1. I chose to include both, and only, `insertBefore` and `insertAfter` as it's much simpler than the half dozen existing methods, and it works better with single-pass updates.
1. I chose to *exclude* properties that aren't available via normal DOM attributes because there's generally nothing left to do.
1. This is *technically* polyfillable without an explosion of code short of the throwing errors part. It would require wrapping `requestAnimationFrame` and `cancelAnimationFrame` to call the callbacks at the right time, but that's about it.
1. The reason I require explicit lists is so browsers can lock the others in that document to update them in parallel and be able to compute their layout without having to assume the prospect of interference with other code. (In practice, they only need *one* read/write lock to rule them all per document, and after that, a quick flag check whose value is initially calculated after animation frame callbacks are run to assert that the node is destined to become live.)
1. The choice of whether to fire an extra callback is deferred until the end because you might find while patching that you might not need to fire that layout callback (which would be potentially pretty expensive). If no layout callbacks are scheduled, the browser can compute layout immediately and exercise its normal magic, all off-thread.
1. The `DOMTreeConstruction` API is about 90% solved by `innerHTML` with `

whatwg / dom

Proposal: DOMChangeList #270
