luwes / js-diff-benchmark

Simple benchmark for testing your DOM diffing algorithm.
66 stars 5 forks source link

Are README numbers updated? #7

Closed WebReflection closed 4 years ago

WebReflection commented 4 years ago

Results seems quite off on my laptop (Dell XPS i7) so I wonder if there's some update needed, as I can see consistently udomdiff faster than others as overall usage.

js-diff-benchmark

luwes commented 4 years ago

it's was kind of head to head on my end, udomdiff was slightly more first but there was also the gzip size.

with your size reduction I'm happy to put you in the first spot! 🎉

thanks for your contributions!

luwes commented 4 years ago

I also normalized the replaceChild operation to be actually 2 operations delete and insert. not sure if this is expected but in the random test udomdiff does seem to have extra operations compared to snabbdom and stage0

Screen Shot 2020-04-15 at 9 22 21 AM
WebReflection commented 4 years ago

not sure if this is expected but in the random test udomdiff does seem to have extra operations compared to snabbdom and stage0

but it's faster, so yes, it's all good to me, the algo is what it is, it's focused on best compromise perf/operations/size, I hope that's OK

luwes commented 4 years ago

definitely, if it's faster and smaller it's A-OK 👌

blankhart commented 4 years ago

udomdiff uses replaceChild in some cases where a node has moved rather than changed. Wouldn't this prevent it from qualifying as reliably "keyed"?

WebReflection commented 4 years ago

@blankhart no, it's reliably keyed, unless you prove it's not, and why that's the case.

Keyed and DOM operations also don't have much to do with each other, and all other algorithms replace too. Inserting does the same too, so it'd be nice if FUD is kept away from this project, thanks.

blankhart commented 4 years ago

Sorry and thanks. I was having an X/Y problem and think my question isn't really about udomdiff, which is definitely "keyed" in the sense @leeoniya described in this issue, i.e. "keyed implementations must not recycle/reuse dom nodes for differing keys (or object identities)."

Before looking at that discussion just now I'd assumed that to be "keyed," when a node moves around within the DOM, it was necessary to tell the runtime to move a DOM node, rather than removing or replacing the node and stitching it back in later. But neither of these approaches would recycle or reuse a DOM node by updating it with new data and so are consistent with being "keyed." (If udomdiff doesn't move nodes in both ways I have misunderstood the algorithm.)

It seemed to me though that those approaches may be distinguishable in practice if some process were observing DOM mutations and reacted differently based on the specific manner in which DOM operations moved a node around (i.e., a single insertBefore that moves a node versus replaceChild and insertBefore to remove a node and reinsert it somewhere else). That was my real question, i.e. whether this difference is important.

The only other algorithms here that I have focused on, list-difference and snabbdom, use replaceChild but they don't use it when moving a node. They use it after checking that the replaced node isn't in the new list and the inserted node wasn't in the old list (e.g., when updating a node in Up10th).

I'll think harder about how best to frame the question as it may be way off base and come back on it in an open issue if it remains a concern. I am a beginner in this area and don't know much yet about MutationObserver or other things that may be relevant. I just posted here in short form because I was away from a computer thinking about the udomdiff anomaly described above (low ms, high ops) and the specific optimization that caused it, which I regard as ingenious.

WebReflection commented 4 years ago

To start with, uhtml is already in that table as both keyed and non-keyed, as it's capable of both, and it's based on udomdiff, so udomdiff definitively works with a keyed approach.

Moreover, this benchmark is only keyed, as each object is kept, moved, or replaced, by identity, no property/id is used to replace anything, only nodes.

On top of that, moving a node in one operation is not even always possible, take the swap row as example:

It doesn't matter which one you replace, one item will eventually be off the tree until it gets re-positioned where the other item was.

(low ms, high ops)

This is not correct. udomdiff does the minimal amount of needed operations almost everywhere, so the high ops is only the random one, but that's because udomdiff doesn't really look for anything, it linearly crawl the source list and work on demand (it streams operations) without striving to find the absolute minimal amount of ops, as it's kinda irrelevant in the real-world, but it grants low ms in most common cases.

Again, I've previously used Levensthein distance and E. Meyer's O(ND) and both were granting perfect amount of minimal operations but were super slow once applied to the DOM.

To recap:

WebReflection commented 4 years ago

Proof

// create two custom elements
document.body.innerHTML = '<a-node>first</a-node><a-node>second</a-node>';

// define a Custom Elements
customElements.define('a-node', class extends HTMLElement {
  connectedCallback() {
    console.log('connected');
  }
  disconnectedCallback() {
    console.log('disconnected');
  }
});

// read connected twice in console ... and ...

// move last child before the first one, without removing it
document.body.insertBefore(
  document.body.lastChild,
  document.body.firstChild
);

// read in console disconnected, followed by connected

See? Moving a node, or removing and inserting it, doesn't make any difference behind the scene: the node is removed and appended, and any mutation observer or custom element would be triggered showing the sequence of removed and added.

blankhart commented 4 years ago

I think this is convincing. Thanks for taking the time to provide such a thoughtful answer. The most interesting thing to me about this is that it both suggests the current benchmarks are measuring a few things in misleading ways, and provides a path to solving those problems.

It's unclear how many DOM operations these algorithms are performing using the current numbers based on calls to the DOM API. Of course, if different calls to the DOM API have different performance impacts even if they produce the same number of DOM mutations, that would be more relevant to the benchmark than counting actual DOM mutations and counting which calls occurred may be informative.

I'll want to understand better for myself why the mutation APIs don't allow a user to observe the difference between a "true" move and a removal/reinsertion, but am persuaded by your answer. Most descriptions of observing a childList referred to observing addition or removal of a node, and it wasn't clear what that really meant in the context where you are telling the runtime that morally the node is "just moving" and the runtime is performing the disconnection/connection operations for you.

WebReflection commented 4 years ago

To be honest, the counting was useful for me while developing udomdiff, and I'm not sure why we give it so much importance in here, as all it matters is the benchmark itself, not really the amount of operations.

However, this is how it works on the DOM:

The appendChild is meaningless for diffing, as operations should be confined in the array of provided nodes, and appendChild might make the container "dirty", because the node might have been appended after other nodes already present in the container, so that is out of the equation.

All differs indeed either do insertBefore, or replaceChild, and removeChild, where only the last one is literally one operation, as it grants the node was live, and it got removed.

Everything else counts as 2 DOM tree mutation events.

Is this relevant? Not sure, the engine might be smart enough to just trigger events without actually needing any repaint/reflow, etc, but this is the status of the DOM when it comes to diffing: every operation costs 1 up to 2, or 3, observable events:

Maybe we should better instrument all callbacks to reflect this counting, and simply show the result or operations, without expectations, as many replace followed by insertBefore might result into 2 operations, when now it's counted as 3 (see high num of opts in udomdiff and the shuffled "deck").

WebReflection commented 4 years ago

@luwes happy to know what you think about changing the counting to reflect what stated above

WebReflection commented 4 years ago

OK, I went ahead and tested this version of the instrument function:

function instrument(parent) {
  const {
    appendChild,
    insertBefore,
    removeChild,
    replaceChild
  } = parent;
  parent.operations = [];
  parent.appendChild = function (newNode) {
    const {textContent} = newNode;
    if (newNode.parentNode)
      this.operations.push(`append: drop(${textContent})`);
    this.operations.push(`append: add(${textContent})`);
    return appendChild.call(this, newNode);
  };
  parent.insertBefore = function (newNode, oldNode) {
    const {textContent} = newNode;
    if (newNode.parentNode)
      this.operations.push(`insert: drop(${textContent})`);
    this.operations.push(
      oldNode ?
        `insert: put(${textContent}) before (${oldNode.textContent})` :
        `insert: add(${textContent})`
    );
    return insertBefore.call(this, newNode, oldNode);
  };
  parent.removeChild = function (oldNode) {
    this.operations.push(`remove: drop(${oldNode.textContent})`);
    return removeChild.call(this, oldNode);
  };
  parent.replaceChild = function (newNode, oldNode) {
    const {textContent} = newNode;
    this.operations.push(`replace: drop(${oldNode.textContent})`);
    if (newNode.parentNode)
      this.operations.push(`replace: drop(${textContent})`);
    this.operations.push(`replace: put(${textContent})`);
    return replaceChild.call(this, newNode, oldNode);
  };
}

The current situation is now this one:

Screenshot from 2020-04-18 16-11-51

This table actually well explains why udomdiff is as fast as others, and it technically performs exactly same amount of DOM mutations, on the shuffle benchmark too.

I think we should go with this version, as it reflects the amount of observable mutations (disconnect and connect) in the real world.

WebReflection commented 4 years ago

Well, @blankhart, Merge Request landed, so thanks for the conversation, it was actually very useful at the end 👍