1904labs / dom-to-image-more

Generates an image from a DOM node using HTML5 canvas

Add `onclone` size optimization for inline styles #92

Open zm-cttae opened 1 year ago

zm-cttae commented 1 year ago

Use case: description, code

In an ideal world we want to optimize SVG output by removing styles that can be inherited.

This mainly has an impact at well over 1K nodes, where the output is on the order of 1 MB.

There's not much of a story here at the moment, but the library did previously lean on `unset` until the #90 regression:

https://github.com/1904labs/dom-to-image-more/blob/2c54d7f87c3bb2a9f16223d225f8f1c0c20e3ac6/src/dom-to-image-more.js#L915-L934

There are two strategies I tried to research:

Some starter code, but this code snippet:

  1. doesn't restore CSS props in place, making it non-deterministic and liable to break styling
  2. doesn't go in ascending order up the DOM tree, thus removing inherited properties that are necessary
    
    const root = document.querySelector('foreignObject > *');
    const treeWalker = document.createTreeWalker(root, NodeFilter.SHOW_ELEMENT);
    const elementList = [];
    let currentNode = treeWalker.currentNode;

    // Collect every element in the cloned tree, in document (preorder) order.
    while (currentNode) {
        elementList.push(currentNode);
        currentNode = treeWalker.nextNode();
    }

    elementList.forEach(function (element) {
        const inlineStyles = element.style;
        const computedStyles = getComputedStyle(element);
        util.asArray(inlineStyles).forEach(function (name) {
            // Read the value before testing for it in the serialized declaration block.
            const value = inlineStyles.getPropertyValue(name);
            if (inlineStyles.cssText.includes(name + ': ' + value + ';')) {
                inlineStyles.removeProperty(name);
                // Put the declaration back if removing it changed the computed value.
                if (value !== computedStyles.getPropertyValue(name)) {
                    inlineStyles.setProperty(name, value);
                }
            }
        });
    });
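One way to address point 2 might be to reorder the collected elements so that the deepest nodes are filtered first, leaving ancestor styles intact while their descendants are checked. The sketch below is only an illustration: `depthOf` and `sortByDescendingDepth` are hypothetical helper names, not part of the library, and the only thing they assume about a node is a `parentElement` link.

```javascript
// Hypothetical helper: number of parentElement hops from `node` up to `root`.
function depthOf(node, root) {
    let depth = 0;
    for (let current = node; current && current !== root; current = current.parentElement) {
        depth += 1;
    }
    return depth;
}

// Hypothetical helper: deepest elements first; document order is kept within a depth.
function sortByDescendingDepth(elements, root) {
    return elements
        .map(function (element, index) {
            return { element: element, depth: depthOf(element, root), index: index };
        })
        .sort(function (a, b) {
            return b.depth - a.depth || a.index - b.index;
        })
        .map(function (entry) {
            return entry.element;
        });
}
```

With the snippet above, this would be applied as `sortByDescendingDepth(elementList, treeWalker.root)` before the filtering loop. Whether that alone is enough to keep every necessary inherited property is untested here.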

IDisposable commented 1 year ago

I'm really steering away from this change right now. Really do NOT want to delve into the very fragile territory that it brings, trying to white-list things. I will add an override path for the SVG path to the impl.utils so you can monkey-patch in something to build upon while we figure things out...

zm-cttae commented 1 year ago

Sounds all right. Happy hols! I imagine the impl.utils solution will keep all happy.


Wrote a `filterWinningInlineStyles` algorithm that's "good enough", for anyone interested. Expand if you want to see a small wall of code.

This code expects the iframe/SVG window to be of the same width as the original document.

## Optimizations

1. **When traversing DOM tree of `node`, group nodes by descending node depth.**

   CSS inheritance is computed on the DOM tree via preorder traversal and is additive-cumulative (increases styling data). For the filter op, which is subtractive, we want to traverse the tree in the opposite direction.

   The algorithm sorts elements in the `node` tree by descending node depth. (This is known as reverse level order traversal.)

   This gives us a 30% to 40% speed boost. This also ensures declarations are only removed when they really can be inherited.

2. **When filtering each inline style declaration by computed effect, go for the most hyphenated properties first.**

   In CSS, shorthands consistently have fewer hyphens than their longhands. We want to filter out scenarios where a CSS property matches its shorthand, e.g. `block-size` -> `height` or `border-color` -> `border`.

   The algorithm does a radix sort with bitmasks for standard, custom and vendored properties, then subsorts by descending hyphen count.

   In tests this filtered another 50% of inline styling. We also get a 20-40% speed boost because we're not setting as many properties back.

```javascript
/* eslint no-implicit-globals: "error" */
(function(global) {
    const contentElement = global.document.body || global.document.documentElement;
    const cssBlockCommentRegex = /\/\*[^*]*\*+([^/*][^*]*\*+)*\//g;
    const cssDeclarationColonRegex = /;\s*(?=-*\w+(?:-\w+)*:\s*(?:[^"']*["'][^"']*["'])*[^"']*$)/g;

    /**
     * Filter inline style declarations for a DOM element tree by computed effect.
     * Estimated inline style reduction at 80% to 90%.
     *
     * @param {HTMLElement} clone
     *      HTML clone with styling from inline attributes and embedded stylesheets only.
     *      Expects fonts and images to have been previously embedded into the page.
     * @returns {Promise}
     *      A promise that resolves to the `clone` reference, now stripped of inline styling
     *      declarations without a computed effect.
     */
    global.dominlinestylefilter = function(clone) {
        const context = new Context(clone);
        return new Promise(stageCloneWith(context))
            .then(collectTree)
            .then(sortAscending)
            .then(multiPassFilter)
            .then(unstageClone);
    };

    /**
     * Synchronous version of {@link onclone}.
     * @param {HTMLElement} clone
     * @returns {HTMLElement}
     */
    dominlinestylefilter.sync = function(clone) {
        let context = new Context(clone);
        try {
            let value = execute(stageCloneWith(context));
            [collectTree, sortAscending, multiPassFilter, unstageClone]
                .forEach(function(fn) { value = fn(value); });
            return value;
        } catch(e) {
            unstageClone(context);
            throw e;
        }
    };

    /**
     * Process context to propagate in promise chain.
     * @param {HTMLElement} clone
     *      Node with all computed styles dumped in the inline styling.
     * @constructor
     */
    function Context(clone) {
        this.root = clone;
        this.sibling = clone.nextSibling;
        this.parent = clone.parentElement;
        this.sandbox = null;
        this.self = null;
        this.tree = null;
        this.pyramid = null;
        this.delta = null;
    }

    /**
     * Styling data for an HTML element.
     * @param {HTMLElement} element Element in the DOM tree of clone.
     * @param {Context} context
     * @constructor
     */
    function Styles(element, context) {
        this.inline = element.style;
        this.computed = context.sandbox.contentWindow.getComputedStyle(element);
    }

    /**
     * Promise executor function.
     * @typedef {(resolve: (value: Context) => void, reject: (reason?: string) => void) => void} Executor
     */

    /**
     * Synchronously execute a promise executor function.
     * @param {Executor} executor
     */
    function execute(executor) {
        let result;
        const resolver = (value) => {
            result = value;
        };
        const rejector = (reason) => {
            throw new Error(reason);
        };
        executor(resolver, rejector);
        return result;
    }

    /**
     * Creates a hidden, rendered sandbox