mozilla / price-tracker

Price Tracker is a Firefox extension that spots price drops on things you’re interested in.
https://addons.mozilla.org/en-US/firefox/addon/price-tracker/
Mozilla Public License 2.0
53 stars 17 forks source link

Make Fathom extraction more performant #319

Open biancadanforth opened 5 years ago

biancadanforth commented 5 years ago

TL;DR: isVisible, which will likely be a common rule in many Fathom applications, accounts for the majority of the 460 ms of Fathom-related jank (67%). It may be possible to reduce this overall Fathom jank by as much as 374 ms (81%) by reducing style and DOM property accesses in isVisible, but this solution requires a privileged context.

Is this a feature request or a bug?

Feature request

What is the current behavior?

Fathom currently causes considerable jank on page load. Given that Price Tracker was an experiment, performance was not a major consideration in development of the current ruleset used to extract a product from a page.

Initial profile

Initial profile[1]

Numbers

Screen Shot 2019-07-16 at 8 33 26 PM

isVisible(fnode) {
  for (const ancestor of ancestors(fnode.element)) {
    const style = getComputedStyle(ancestor);
    const isElementHidden = (
      style.visibility === 'hidden'
      || style.display === 'none'
      || style.opacity === '0'
      || style.width === '0'
      || style.height === '0'
    );
    if (isElementHidden) {
      return false;
    }
  }
  return true;
}
/**
 * Yield an element and each of its ancestors.
 */
export function *ancestors(element) {
    yield element;
    let parent;
    while ((parent = element.parentNode) !== null && parent.nodeType === parent.ELEMENT_NODE) {
        yield parent;
        element = parent;
    }
}

What is the expected or desired behavior?

Fathom should be able to run much faster with much less jank.

I shared this profile with some Firefox engineers, including a layout (Emilio) and a performance engineer (Greg).

Why is isVisible taking so much time?

Updating isVisible: first attempt

Revised profile: first attempt[1]

This focuses on the fix of using the cheapest style accesses first (e.g. getBoundingClientRect) and early return when possible in isVisible.

Numbers

// Avoid reading `display: none` due to Bug 1381071
isVisible(fnode) {
  const element = fnode.element;
  if (element.getBoxQuads().length === 0) {
    // The element has no box (display: none subtree?), checking
    // getBoundingClientRect instead for a width and height of 0 only tells
    // us it is a zero-sized box.
    return false;
  }
  const eleStyle = getComputedStyle(element);
  if (eleStyle.visibility === 'hidden') {
    return false; // not painted.
  }
  // This is assuming inclusive ancestors, otherwise we need to check opacity
  // above too.
  for (const ancestor of ancestors(element)) {
    const style = getComputedStyle(ancestor);
    if (style.opacity === '0') {
      return false;
    }
    if (style.display === 'contents') {
      // display: contents elements have no box themselves, but children are
      // still rendered.
      continue;
    }
    const rect = ancestor.getBoundingClientRect();
    if ((rect.width === 0 || rect.height === 0) && style.overflow === 'hidden') {
      // This is a zero-sized box; zero-sized ancestors don’t make descendants
      // hidden unless the descendant has overflow: hidden
      return false;
    }
  }
  return true;
}

Conclusions

Updating isVisible: second attempt

Revised profile: second attempt[1]

This focuses on the fix of mitigating (actually, eliminating) redundant work, removing ancestors entirely to eliminate DOM accesses, and reducing the number of style accesses by checking if the element is clickable.

Numbers

  isVisible(fnode) {
    const element = fnode.element;
    const rect = element.getBoundingClientRect();
    return !!(rect.height && rect.width && (document.elementsFromPoint(rect.x
    + rect.width / 2, rect.y + rect.height / 2)).includes(element));
  }

Conclusions

Version information

[1]: Profiled on a locally hosted product page with specs from the "Version information" section in this comment.

biancadanforth commented 5 years ago

Note: As is, isVisible has a bug, since getComputedStyle().width and getComputedStyle().height return values like "0px" and such, not "0" (thanks Emilio).

Also, it may include elements for consideration that are offscreen and therefore not practically visible to the user. Fathom's isVisible function handles this case, though it suffers from the same error as above since it originated from Price Tracker's implementation.

Fathom v3.1.0 isVisible implementation

/**
 * Return whether an element is practically visible, considing things like 0
 * size or opacity, ``display: none``, and ``visibility: hidden``.
 */
export function isVisible(fnodeOrElement) {
    const element = toDomElement(fnodeOrElement);
    for (const ancestor of ancestors(element)) {
        const style = getComputedStyle(ancestor);
        if (style.visibility === 'hidden' ||
            style.display === 'none' ||
            style.opacity === '0' ||
            style.width === '0' ||
            style.height === '0') {
            return false;
        } else {
            // It wasn't hidden based on a computed style. See if it's
            // offscreen:
            const rect = element.getBoundingClientRect();
            const frame = element.ownerDocument.defaultView;  // window or iframe
            if ((rect.right + frame.scrollX < 0) ||
                (rect.bottom + frame.scrollY < 0)) {
                return false;
            }
        }
    }
    return true;
}

Probably what this issue boils down to is to use the isVisible function in the Fathom library, once that has been updated with a more performant and complete version.

biancadanforth commented 5 years ago

Updating isVisible: third attempt

(continued from the first comment in this thread)

Revised profile: third attempt[1]

This implementation of isVisible focuses on the fixes of leveraging natural flushes by using the Intersection Observer API and caching results.

Numbers

const visibleElements = new Set(); // If an element belongs to this set, it's visible

function isVisiblePreProcessing() {
  const observers = new Map(); // element => IntersectionObserver instance
  const observerOptions = {
    root: document.body,
    threshold: 1.0,
  };
  function intersectionCallback(entries) {
    entries.forEach((entry) => {
      const ele = entry.target;
      if (entry.isIntersecting) {
        visibleElements.add(ele);
      }
      const observer = observers.get(ele);
      observer.unobserve(ele);
      observers.delete(ele);
    });
  }

  const root = document.body;
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_ELEMENT, null, false);
  let {currentNode} = walker;
  while (currentNode) {
    const {tagName} = currentNode;
    if (tagName === 'BODY' || tagName === 'SCRIPT') {
      currentNode = walker.nextNode();
      continue;
    }
    const observer = new IntersectionObserver(intersectionCallback, observerOptions);
    observers.set(currentNode, observer);
    observer.observe(currentNode);
    currentNode = walker.nextNode();
  }
}

isVisiblePreProcessing();

// ...

isVisible(fnode) {
    return visibleElements.has(fnode.element);
}

Conclusions

[1]: Profiled on a locally hosted product page with specs from the "Version information" section in this comment.

mikeconley commented 5 years ago

Hi @biancadanforth,

I took a read through this and a quick peek at the Price Tracker code. A few questions:

  1. Right now, it looks like the extraction script runs at document_end. Is that strictly necessary? If you use document_idle instead, you can queue the work to occur after the pageload has completed, so as to not impact it.

  2. If you've already tried this, please ignore, but if not, this might be worth considering:

It's possible to decrease the likelihood of causing synchronous layout flushes, and making getBoundingClientRect calls cheaper, by (as you've noted) waiting until a "natural" layout flush. promiseDocumentFlushed is not available to you, but you can kind of emulate it via:

requestAnimationFrame(() => {
  setTimeout(() => {
    // This code will be run ASAP after Style and Layout information have
    // been calculated and the paint has occurred. Unless something else
    // has dirtied the DOM very early, querying for style and layout information
    // here should be cheap.
  }, 0);
});

We're still going to get burned if the page somehow runs any JS that dirties the DOM before the setTimeout runs, but if it hasn't, then calls to getBoundingClientRect inside of that setTimeout should be cheap.

Assuming you can get pretty cheap calls to getBoundingClientRect(), you might be able to use those values plus the scroll offset of the window to calculate if an item is collapsed (0x0 dimensions - this will also cover display: none), invisible (visibility: hidden, or opacity: 0), or off screen (x and y values along with width and height place it outside the display port).

If that's still not cheap enough, then perhaps we could expose some kind of special power to your WebExtension content script to get the information it needs more directly from Layout - though we'd probably want to confer with the Layout team to see how easy that is.

  1. The other thing to consider is that, sometimes things are just unavoidably expensive, and there's not much we can practically do about the overall cost. When that occurs, the next tactic is trying to figure out if you can break up the work over a larger chunk of time. I'm not sure if Fathom is compatible with this idea, but you could use something like IdleCallbacks. Check out this MDN article on using them - but the general idea is that you queue up functions to run during "free idle periods", and you have a time budget you can use, and you yield back when you've reached your budget. You get to continue during the next idle slice.

This can add some interesting bugs where the document state might change underneath you during your scan, but if you're willing to live with those bugs, this might be a fallback plan worth considering.

biancadanforth commented 5 years ago

Thank you very much @mikeconley for taking the time to look at this and offer your thoughts.

Right now, it looks like the extraction script runs at document_end. Is that strictly necessary? If you use document_idle instead, you can queue the work to occur after the pageload has completed, so as to not impact it.

This is a good observation. Certainly if we waited to run Fathom (i.e. isVisible and friends) during idle time, we should see less jank at page load, and we do, though its relative share of the jank is a tenth of that contributed by isVisible.[1]

Originally, the decision to run this at document_end was the fact that we wanted to extract the information from the page as soon as possible, and we found that this was the earliest time that we could get the information successfully in many cases.

While in the case of the Price Tracker application of Fathom, we probably could wait until document_idle to extract, there are some applications like popup blocking, login form identification for the password manager, etc. where waiting for the page to finish loading can make the difference between the feature being useful and not.

Additionally, when this extraction code is run is not handled directly by Fathom but by the application author, as how that is set up depends on the application context. While we can certainly make recommendations, I am especially interested in where we can make performance gains (and to maximal effect) in the Fathom code itself (such as Fathom's isVisible method).

It's possible to decrease the likelihood of causing synchronous layout flushes, and making getBoundingClientRect calls cheaper, by (as you've noted) waiting until a "natural" layout flush. promiseDocumentFlushed is not available to you, but you can kind of emulate it via: [requestAnimationFrame with setTimeout]

It took me a while to get this approach working, as requestAnimationFrame is async, and Fathom is sync. I'm going to go ahead and format this in the same way I did my previous approaches above, so please excuse all the styling.

Updating isVisible: fourth attempt - requestAnimationFrame

(continued from the first comment in this thread)

What I ended up doing was adding a one-time preprocessing step that is awaited before Fathom is run. This pre-processing step walks the DOM and checks each node of interest using the original isVisible implementation at this opportune time using a setTimeout inside a requestAnimationFrame. It required a decent amount of rewiring in Price Tracker; the diff (branched from PR #317) is below.[2]

Revised profile: fourth attempt using requestAnimationFrame[3]

Numbers

Conclusions

If that's still not cheap enough, then perhaps we could expose some kind of special power to your WebExtension content script to get the information it needs more directly from Layout

How do I know what’s cheap enough? What do you mean by special powers? I’m aware of that meaning in mochitests only heh.

The other thing to consider is that, sometimes things are just unavoidably expensive, and there's not much we can practically do about the overall cost. When that occurs, the next tactic is trying to figure out if you can break up the work over a larger chunk of time.

There are a couple of challenges that I see with a concurrency approach. The first is we’d have to rewrite Fathom to be async, as it is sync right now. The second is that, depending on how much work needs to be done, the wall clock time could be the difference between Fathom completing fast enough for the results to be useful in some applications (as I mentioned earlier, e.g. popup blocking). It’s not completely off the table, but it does seem like a heavier lift. I'll check out that link, thanks.

[1]: I profiled the original isVisible implementation running in the content script loaded at document_idle instead of document_end. I still see 430 ms of Fathom-related jank compared to the 460 ms. This means that that one change alone, at least for the sample page I’ve been running on, would reduce Fathom-related jank by 6.5%. By comparison, in the same profile, isVisible contributes 327 ms or 76% of Fathom-related jank. So while that change would certainly help, it’s far from the biggest offender, and as I mention elsewhere, waiting until page load could make the difference in usability for certain applications.

[2]:

diff --git a/src/extraction/fathom/index.js b/src/extraction/fathom/index.js
index be9ec03..acf60d8 100644
--- a/src/extraction/fathom/index.js
+++ b/src/extraction/fathom/index.js
@@ -17,15 +17,6 @@ import coefficients from 'commerce/extraction/fathom/coefficients.json';

 // Minimum score to be considered the "correct" feature element extracted by Fathom
 const SCORE_THRESHOLD = 0.5;
-// For production, we don't need to generate a new ruleset factory
-// and ruleset every time we run Fathom, since the coefficients are static.
-const rulesetFactory = new RulesetFactory();
-const rules = rulesetFactory.makeRuleset([
-  ...coefficients.image,
-  ...coefficients.title,
-  ...coefficients.price,
-],
-biases);

 /** How product information is extracted depends on the feature */
 const FEATURE_DEFAULTS = {
@@ -71,7 +62,7 @@ const PRODUCT_FEATURES = {
  * Extracts the highest scoring element above a score threshold
  * contained in a page's HTML document.
  */
-function runRuleset(doc) {
+function runRuleset(doc, rules) {
   const extractedElements = {};
   const results = rules.against(doc);
   for (const feature of Object.keys(PRODUCT_FEATURES)) {
@@ -95,13 +86,28 @@ function hasAllFeatures(obj) {
 /*
  * Run the ruleset for the product features against the current window document
  */
-export default function extractProduct(doc) {
-  const extractedProduct = {};
-  const extractedElements = runRuleset(doc);
-  if (hasAllFeatures(extractedElements)) {
-    for (const [feature, methods] of Object.entries(PRODUCT_FEATURES)) {
-      extractedProduct[feature] = methods.getValueFromElement(extractedElements[feature]);
+export default async function extractProductFactory() {
+  const rulesetFactory = await RulesetFactory.isVisiblePreProcessing();
+
+  // For production, we don't need to generate a new ruleset factory
+  // and ruleset every time we run Fathom, since the coefficients are static.
+  const rules = rulesetFactory.makeRuleset([
+    ...coefficients.image,
+    ...coefficients.title,
+    ...coefficients.price,
+  ],
+  biases);
+
+  function extractProduct(doc) {
+    const extractedProduct = {};
+    const extractedElements = runRuleset(doc, rules);
+    if (hasAllFeatures(extractedElements)) {
+      for (const [feature, methods] of Object.entries(PRODUCT_FEATURES)) {
+        extractedProduct[feature] = methods.getValueFromElement(extractedElements[feature]);
+      }
     }
+    return hasAllFeatures(extractedProduct) ? extractedProduct : null;
   }
-  return hasAllFeatures(extractedProduct) ? extractedProduct : null;
+
+  return extractProduct;
 }
diff --git a/src/extraction/fathom/ruleset_factory.js b/src/extraction/fathom/ruleset_factory.js
index fd1ca4b..372b793 100644
--- a/src/extraction/fathom/ruleset_factory.js
+++ b/src/extraction/fathom/ruleset_factory.js
@@ -15,6 +16,64 @@ const ONEISH = 0.9;
  * easier testing.
  */
 export default class RulesetFactory {
+  constructor(visibleElements) {
+    this.visibleElements = visibleElements;
+  }
+
+  static isVisiblePreCheck(element) {
+    for (const ancestor of ancestors(element)) {
+      const style = getComputedStyle(ancestor);
+      const isElementHidden = (
+        style.visibility === 'hidden'
+        || style.display === 'none'
+        || style.opacity === '0'
+        || style.width === '0'
+        || style.height === '0'
+      );
+      if (isElementHidden) {
+        return false;
+      }
+    }
+    return true;
+  }
+
+  static async isVisiblePreProcessing() {
+    const visibleElements = new Set();
+    const isVisiblePromises = [];
+    const root = document.body;
+    const walker = document.createTreeWalker(root, NodeFilter.SHOW_ELEMENT, null, false);
+    let {currentNode} = walker;
+    while (currentNode) {
+      const {tagName} = currentNode;
+      const desiredTags = ['DIV', 'IMG', 'H1', 'SPAN', 'H2']; // based on our dom() LHS for each feature
+      if (!desiredTags.includes(tagName)) {
+        currentNode = walker.nextNode();
+        continue;
+      }
+      // Need block level scoping inside the while loop to capture the current iteration's
+      // currentNode value inside setTimeout
+      const node = currentNode;
+      // eslint-disable-next-line no-loop-func
+      const promise = new Promise(resolve => requestAnimationFrame(() => {
+        setTimeout(() => {
+          // This code will be run ASAP after Style and Layout information have
+          // been calculated and the paint has occurred. Unless something else
+          // has dirtied the DOM very early, querying for style and layout information
+          // here should be cheap.
+          const isVisible = this.isVisiblePreCheck(node);
+          if (isVisible) {
+            visibleElements.add(node);
+          }
+          resolve();
+        }, 0, resolve, node);
+      }));
+      currentNode = walker.nextNode();
+      isVisiblePromises.push(promise);
+    }
+    await Promise.all(isVisiblePromises);
+    return new RulesetFactory(visibleElements);
+  }
+
   getHighestScoringImage(fnode) {
     return fnode._ruleset.get('image')[0]; // eslint-disable-line no-underscore-dangle
   }
@@ -199,24 +258,7 @@ export default class RulesetFactory {
   }

   isVisible(fnode) {
-    for (const ancestor of ancestors(element)) {
-      const style = getComputedStyle(ancestor);
-      const isElementHidden = (
-        style.visibility === 'hidden'
-        || style.display === 'none'
-        || style.opacity === '0'
-        || style.width === '0'
-        || style.height === '0'
-      );
-      if (isElementHidden) {
-        return false;
-      }
-    }
-    return true;
+    return this.visibleElements.has(fnode.element);
   }

   hasBackgroundImage(fnode) {
diff --git a/src/extraction/index.js b/src/extraction/index.js
index db15199..10f3037 100644
--- a/src/extraction/index.js
+++ b/src/extraction/index.js
@@ -17,17 +17,6 @@ import extractProductWithOpenGraph from 'commerce/extraction/open_graph';
 import {shouldExtract} from 'commerce/privacy';
 import recordEvent from 'commerce/telemetry/content';

-/**
- * Extraction methods are given the document object for the page, and must
- * return either a valid ExtractedProduct, or null if a valid product could not
- * be found.
- */
-const EXTRACTION_METHODS = {
-  fathom: extractProductWithFathom,
-  css_selectors: extractProductWithCSSSelectors,
-  open_graph: extractProductWithOpenGraph,
-};
-
 let isBackgroundUpdate = false;
 try {
   isBackgroundUpdate = window.top.location.href.startsWith(
@@ -74,10 +63,10 @@ class ExtractionAttempt {
  * order until one of them returns a truthy result.
  * @return {ExtractedProduct|null}
  */
-function extractProduct() {
+function extractProduct(methods) {
   const attempt = new ExtractionAttempt();
   attempt.start();
-  for (const [methodName, extract] of Object.entries(EXTRACTION_METHODS)) {
+  for (const [methodName, extract] of Object.entries(methods)) {
     const extractedProduct = extract(window.document);
     if (extractedProduct) {
       attempt.succeed(methodName);
@@ -107,8 +96,8 @@ async function sendProductToBackground(extractedProduct) {
  * Checks to see if any product information for the page was found,
  * and if so, sends it to the background script.
  */
-async function attemptExtraction() {
-  const extractedProduct = extractProduct();
+async function attemptExtraction(methods) {
+  const extractedProduct = extractProduct(methods);
   if (extractedProduct) {
     await sendProductToBackground(extractedProduct);
   }
@@ -140,16 +129,25 @@ async function attemptExtraction() {
     return;
   }

+  // Extraction methods are given the document object for the page, and must
+  // return either a valid ExtractedProduct, or null if a valid product could not
+  // be found.
+  const methods = {
+    fathom: await extractProductWithFathom(),
+    css_selectors: extractProductWithCSSSelectors,
+    open_graph: extractProductWithOpenGraph,
+  };
+
   // Extract immediately, and again if the readyState changes.
-  let extractedProduct = await attemptExtraction();
+  let extractedProduct = await attemptExtraction(methods);
   document.addEventListener('readystatechange', async () => {
-    extractedProduct = await attemptExtraction();
+    extractedProduct = await attemptExtraction(methods);
   });

   browser.runtime.onMessage.addListener(async (message) => {
     if (message.type === 'reextract-product') {
       // Re-extract the product if requested
-      extractedProduct = await attemptExtraction();
+      extractedProduct = await attemptExtraction(methods);
       await sendProductToBackground(extractedProduct);
     } else if (message.type === 'resend-product') {
       // Re-send the already-extracted product if requested

[3]: Profiled on a locally hosted product page with specs from the "Version information" section in this comment.