luwes / js-diff-benchmark

Simple benchmark for testing your DOM diffing algorithm.
66 stars 5 forks source link

Fix Paul Heckel algorithm #1

Open luwes opened 4 years ago

luwes commented 4 years ago

The number of operations is way too big.

Seems the script used in the benchmark is broke. https://johnresig.com/projects/javascript-diff-algorithm/

cc @jeresig

luwes commented 4 years ago

links to useful docs:

WebReflection commented 4 years ago

not sure if it helps, but I've described the pitfall of string based algorithms in here ... the TL;DR version is:

string based diffing algorithms don't have DOM constrains (a node cannot exists simultaneously in two lists or places) ... so anything based on strings algorithms, once applied to DOM, inevitably need to adjust their logic to deal with already live, or moved, nodes.

In few words, I don't think Paul Heckel algorithm is suitable for this task, performance wise, but maybe there's room for improvements.

WebReflection commented 4 years ago

P.S. if interested, this is an E.Meyers's O(ND) Based Diffing Algorithm library with a DOM diffing based example included.

Performance are great, for a general purpose diffing library, yet far away from udomdiff, and this is part of my 3 years old research on this topic ... I think I've tried 'em all.

luwes commented 4 years ago

what I got from researching it, Heckel doesn't work if it has duplicates in one of the lists which is not the case for DOM nodes. there's a bunch of Swift implementations out there based on Heckel.

https://ra1028.github.io/DifferenceKit/

this caught my attention

Screen Shot 2020-04-15 at 12 00 21 PM
luwes commented 4 years ago

string based diffing algorithms don't have DOM constrains (a node cannot exists simultaneously in two lists or places) ... so anything based on strings algorithms, once applied to DOM, inevitably needs to adjust their logic to deal with already live, or moved, nodes.

not fully getting this. dom nodes vs unique strings in the A and B lists are comparable no? it's just inserts and deletes

WebReflection commented 4 years ago

Heckel doesn't work if it has duplicates in one of the lists which is not the case for DOM nodes.

not sure I get you right here, but a and b can contain the same nodes, hence you have "duplicates" in those lists, and adjustments are needed to avoid interfering with what was live already and will still be live maybe somewhere else.

WebReflection commented 4 years ago

dom nodes vs unique strings in the A and B lists are comparable no?

the issue is that the moment you flag a letter/sequence for insertion into b, the same letter/sequence could've been present somewhere else in a (source, live) so that b the list in a already changed as side-effect.

const live = ['b', 'd', 'a'];
const next = ['a', 'b', 'c'];

the string way: insert a before b, remove d, add c, ignore or remove a the DOM way: prepend a before b, replace d with c

the DOM needs linearly 2 operations instead of 3, 'cause a would virtually disappear from the equation once inserted. Not sure I've explained it properly, but basically string don't mutate while creating the diff, the DOM does, and udomdiff tries to take this into account, string based algos don't consider this difference, as strings are immutable.

WebReflection commented 4 years ago

apologies, badly worded ... I guess the quick impression is that P. Heckel is fine, but it can't perform much better than DOM oriented diffing strategies.

dy commented 4 years ago

Please consider that Heckel intensely uses nextSibling, and the way that is implemented in Dommy is slow: https://github.com/luwes/js-diff-benchmark/blob/312ab02f7c36f68d57f5ba0ebea16ab6068b70d8/src/dommy.js#L92-L96

Browsers don't do search and instead keep nextSibling/previousSibling references directly. Fixed DOM engine would also let reduging spect to 272b, as it is done here.

WebReflection commented 4 years ago

@dy fair enough, but every other library that uses nextSibling is similarly affected, right?

luwes commented 4 years ago

good point @dy and @WebReflection

besides that I think the Heckel script is still not working like it should. it's a bad implementation, if I have some more time I'm gonna try and create a correct version. kinda curious how this performs.

but maybe yes making that a linked list or so would be more accurate. happy to review any PR's ;)

luwes commented 4 years ago

272B is damn small! 🦾