matatk / landmarks

Allows you to navigate a web page via WAI-ARIA landmarks, using the keyboard or a pop-up menu
http://matatk.agrip.org.uk/landmarks/
MIT License

A forest of trees! #389

Open · matatk opened this issue 4 years ago

matatk commented 4 years ago

Right, I defo want to move to using a tree to display the landmarks (and ultimately headings) (#238). (That could also be used to display enhanced info in DevTools.)

Also, I have been thinking more about this recently, and it's very important to ensure the extension is as efficient as possible, not just in terms of how the user feels (which I think is likely to be pretty good now) but also in ensuring it uses as little CPU time, and therefore energy, as possible. I think it's worth trying out a tree-based approach, in which only subtrees of the DOM are scanned when relevant mutations happen. However, some concerns:

My assumption is that it would be worthwhile to move to:

How to get there (and check that getting there is a good idea)?

Test current performance

Move to a tree-based data structure

This would make rendering easier. The current DOM walk guarantees DOM order, which is nice and simple. Was thinking of using querySelectorAll() but not sure if that guarantees the order, and it would involve even bigger code changes. Can always try it later.
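(If it comes to that, the ordering question could be settled with a quick throwaway check in the console; the selector below is just an example, not the extension's real landmark test.)

```js
// Hypothetical sanity check: compare querySelectorAll()'s result order
// against a depth-first DOM walk over the same selector.
const SELECTOR = 'main, nav, aside, header, footer, [role]'

const fromQuery = Array.from(document.body.querySelectorAll(SELECTOR))

const fromWalk = []
function walk(element) {
	for (const child of element.children) {
		if (child.matches(SELECTOR)) fromWalk.push(child)
		walk(child)  // depth-first, in document order
	}
}
walk(document.body)

console.log('Same order?',
	fromQuery.length === fromWalk.length &&
	fromQuery.every((element, index) => element === fromWalk[index]))
```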

In order to be useful for finding the closest parent landmark to an element where a mutation has happened, we need to:

  1. Find the nearest parent landmark to a mutation (which could be the element that was mutated). This can be done in the DOM using data-* attributes.
  2. Map the looked-up custom attribute value to an actual in-memory HTML element—or (sub-)root of a landmarks JS object tree.
  3. Scan the document from that point down.
  4. Replace that (sub-)tree of landmark results.
  5. Send the whole landmarks tree to be rendered (later: this can be refined to only re-render what's needed).

Seems we need:

  1. Custom data-* attributes.
  2. A map from data-* attribute value to JS object that stores a (sub-)tree.
  3. A root JS object that stores all the landmarks.

It's possible to get a reference to the JS object so this is sounding do-able!
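To make that concrete, here's a rough sketch of how the pieces could fit together. (None of the names below, e.g. LANDMARK_ID_ATTR, collectInto() and handleMutation(), are from the extension's code, and landmark detection is boiled down to a hard-coded selector just to keep things short.)

```js
const LANDMARK_ID_ATTR = 'data-landmark-scan-id'  // hypothetical custom attribute
const LANDMARK_SELECTOR =
	'main, nav, aside, header, footer, [role="main"], [role="navigation"], [role="region"]'

const subtreeForId = new Map()  // data-* attribute value -> node in the JS landmarks tree
let nextId = 0
let landmarksTree = null        // root JS object that stores all the landmarks

function tagLandmark(element, node) {
	const id = String(nextId++)
	element.setAttribute(LANDMARK_ID_ATTR, id)
	subtreeForId.set(id, node)
}

// Walk the DOM from `element` down (in document order), adding any landmarks
// found to `parentNode.contains`, and tagging them so they can be found again.
function collectInto(element, parentNode) {
	for (const child of element.children) {
		if (child.matches(LANDMARK_SELECTOR)) {
			const childNode = { element: child, contains: [] }
			tagLandmark(child, childNode)
			parentNode.contains.push(childNode)
			collectInto(child, childNode)    // nested landmarks go under this one
		} else {
			collectInto(child, parentNode)   // keep looking below non-landmark elements
		}
	}
}

// Full-page scan: <body> is tagged too, so a mutation outside any landmark
// naturally maps back to the root of the tree (i.e. a full re-scan).
function fullScan() {
	landmarksTree = { element: document.body, contains: [] }
	tagLandmark(document.body, landmarksTree)
	collectInto(document.body, landmarksTree)
}

// Called (after mutation guarding) when a relevant mutation has happened.
function handleMutation(mutationTarget) {
	const startElement = mutationTarget.nodeType === Node.ELEMENT_NODE
		? mutationTarget
		: mutationTarget.parentElement

	// 1. Nearest containing landmark (could be the mutated element itself).
	const landmarkElement = startElement && startElement.closest(`[${LANDMARK_ID_ATTR}]`)
	if (!landmarkElement) return  // e.g. a mutation outside <body>

	// 2. Map the attribute value back to the in-memory (sub-)tree object.
	const subtree = subtreeForId.get(landmarkElement.getAttribute(LANDMARK_ID_ATTR))

	// 3. Scan from that point down, and 4. replace that (sub-)tree of results
	//    in place. (A real version would also prune now-stale map entries.)
	subtree.contains = []
	collectInto(landmarkElement, subtree)

	// 5. Send the whole landmarks tree to be rendered.
	render(landmarksTree)
}

function render(tree) {
	console.log(tree)  // stand-in for sending the tree to the GUI
}

fullScan()
```

Replacing the subtree's contains in place means the references held by parent nodes stay valid; the cost is that the attribute-to-subtree map accumulates stale entries, which a real implementation would need to prune.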

Steps

  1. Check that the idea of using the DOM, an attribute-value-to-subtree mapping and replacing parts of the tree actually works (quick prototype).
  2. Test the current performance.
  3. Move to using trees and check the performance.

Things that could build on this in future (some mentioned above):

And much later, or "if there's time" :-)...

matatk commented 3 years ago

Was thinking that one way to test the performance both simply and cross-browser would be to compare the number of document nodes visited with and without the tree-based approach above. This would eliminate the need to test in real browsers, and make the results more portable.
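Something along these lines (entirely hypothetical; visit() stands in for whatever does the per-element landmark checks) would be the counter:

```js
// Count element visits instead of timing the scan, so the number doesn't
// depend on which browser or machine ran the test.
let elementsVisited = 0

function visit(element) {
	elementsVisited++
	// ...the per-element landmark checks would happen here...
	for (const child of element.children) visit(child)
}

// e.g. compare a full-page scan with a subtree scan of <main> only:
elementsVisited = 0
visit(document.body)
console.log('Full scan visited', elementsVisited, 'elements')

const main = document.querySelector('main')
if (main) {
	elementsVisited = 0
	visit(main)
	console.log('Subtree scan visited', elementsVisited, 'elements')
}
```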

I've just come across other ways to traverse the DOM: NodeIterator and TreeWalker. These could be really interesting and reduce the code needed in Landmarks (and maybe negate the need for the approach above) but to measure their performance IRL the tests would have to be done... IRL.

(I did find a gist about the performance of TreeWalker, but it's not current.)
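For reference, a TreeWalker-based traversal might look something like this (the selector is only a stand-in for the real landmark checks, and a flat walk like this would still need extra work to recover the nesting of landmarks):

```js
// Visit only candidate landmark elements, letting the browser do the walking.
const CANDIDATE_SELECTOR = 'main, nav, aside, header, footer, [role]'

const walker = document.createTreeWalker(
	document.body,
	NodeFilter.SHOW_ELEMENT,
	node => node.matches(CANDIDATE_SELECTOR)
		? NodeFilter.FILTER_ACCEPT
		: NodeFilter.FILTER_SKIP  // skip this element but still visit its children
)

const candidates = []
while (walker.nextNode()) candidates.push(walker.currentNode)
console.log('Found', candidates.length, 'candidate landmark elements')
```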

So... still thinking on how to orchestrate the test harness.

matatk commented 3 years ago

The fundamental question around whether it's worth changing approach so radically is how to cope with mutation guarding: will this be fast enough such that full-page scans are never needed? I can't imagine so, because some pages will change loads, and those mutations need to be ignored.

So the threshold for ignoring mutations would be made a lot more permissive (but maybe with a much stronger back-off), and the question becomes: is scanning only the part of the document that was mutated, but more times, more efficient than scanning the whole document far fewer times?
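To pin down what I mean by a more permissive threshold with a stronger back-off, something along these lines (purely a sketch: none of the numbers or names come from the current code, and handleMutation() is the hypothetical subtree re-scan sketched in the earlier comment):

```js
// Allow far more re-scans than a full-page scanner could afford, but pause
// for twice as long every time the limit is hit again.
const BATCH_LIMIT = 20     // mutation batches handled before backing off
const BASE_PAUSE_MS = 500

let batchesSinceBackOff = 0
let backOffs = 0
let pausedUntil = 0

function onMutations(mutations) {
	const now = Date.now()
	if (now < pausedUntil) return  // currently backing off: ignore entirely

	if (batchesSinceBackOff >= BATCH_LIMIT) {
		backOffs += 1  // each back-off lasts twice as long as the previous one
		pausedUntil = now + BASE_PAUSE_MS * 2 ** backOffs
		batchesSinceBackOff = 0
		return  // (a real version would decay backOffs once the page settles)
	}

	batchesSinceBackOff += 1
	for (const mutation of mutations) {
		handleMutation(mutation.target)  // subtree re-scan, as sketched above
	}
}

new MutationObserver(onMutations).observe(document.body,
	{ childList: true, subtree: true, attributes: true, characterData: true })
```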

PRs #407 and #408 have developed the profiling more (and Firefox is now supported experimentally by Puppeteer, which is exciting!). Was thinking that, now multiple LandmarksFinders are sort-of supported (they are, but it's hardcoded in #410), it would be interesting to run a full-page and a tree-based scanner at once and check the stats in DevTools about how they're working. But that does require having at least prototyped the tree-based one, and that requires time that would ideally need to be justified by clear, or very strongly expected, performance gains...