Siteimprove / alfa

:wheelchair: Suite of open and standards-based tools for performing reliable accessibility conformance testing at scale
MIT License
102 stars 11 forks source link

Optimize accessible name calculation #1611

Closed rcj-siteimprove closed 1 month ago

rcj-siteimprove commented 2 months ago

This PR fixes two optimization problems in the call hierarchy of Name.fromDescendants:

  1. Name.fromSteps would potentially call the same functions two times unnecessarily due to an .orElse. This was very bad because the collection of functions passed in would contain up to two functions which would call fromDescendants recursively making the traversal grow extremely fast for some certain inputs.
    • Fixing this brought the number of recursive calls for some test input down from ~8.000.000 to ~3.000 and the run time from ~50.000 ms to ~200 ms
  2. Name.fromElement would potentially recurse into Name.fromDescendants two times with identical arguments
    • Fixing this additionally brought the number of recursive calls for the same test input from ~3.000 down to ~100 and the run time to ~50 ms.

In the process of troubleshooting, I in-lined and thus eliminated the Name.fromSteps call in Name.fromElement. Despite making fromElement more complex, I decided to keep this in-lined because I think passing functions that is part of a recursive loop as arguments into another function which then just invokes them anyway, makes the flow very hard to follow and makes these kinds inefficiencies hard to spot.

However Name.fromSteps is called from other places, so it has nonetheless been optimized with a cache to avoid unnecessary re-computation. We could consider in-lining this the few other places it's used using the same approach and eliminate it completely.

I have taken care not to change functionality and the test suite didn't detect any regressions, but we might still want to have this regression tested.

changeset-bot[bot] commented 2 months ago

🦋 Changeset detected

Latest commit: 4e36f7b47bafd5f66d30d43bc43b994abc30cc34

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 78 packages | Name | Type | | ------------------------------- | ----- | | @siteimprove/alfa-aria | Patch | | @siteimprove/alfa-act | Patch | | @siteimprove/alfa-affine | Patch | | @siteimprove/alfa-applicative | Patch | | @siteimprove/alfa-array | Patch | | @siteimprove/alfa-bits | Patch | | @siteimprove/alfa-branched | Patch | | @siteimprove/alfa-cache | Patch | | @siteimprove/alfa-callback | Patch | | @siteimprove/alfa-cascade | Patch | | @siteimprove/alfa-clone | Patch | | @siteimprove/alfa-collection | Patch | | @siteimprove/alfa-comparable | Patch | | @siteimprove/alfa-compatibility | Patch | | @siteimprove/alfa-continuation | Patch | | @siteimprove/alfa-css-feature | Patch | | @siteimprove/alfa-css | Patch | | @siteimprove/alfa-device | Patch | | @siteimprove/alfa-dom | Patch | | @siteimprove/alfa-earl | Patch | | @siteimprove/alfa-either | Patch | | @siteimprove/alfa-emitter | Patch | | @siteimprove/alfa-encoding | Patch | | @siteimprove/alfa-equatable | Patch | | @siteimprove/alfa-flags | Patch | | @siteimprove/alfa-fnv | Patch | | @siteimprove/alfa-foldable | Patch | | @siteimprove/alfa-functor | Patch | | @siteimprove/alfa-future | Patch | | @siteimprove/alfa-generator | Patch | | @siteimprove/alfa-graph | Patch | | @siteimprove/alfa-hash | Patch | | @siteimprove/alfa-http | Patch | | @siteimprove/alfa-iana | Patch | | @siteimprove/alfa-iterable | Patch | | @siteimprove/alfa-json-ld | Patch | | @siteimprove/alfa-json | Patch | | @siteimprove/alfa-lazy | Patch | | @siteimprove/alfa-list | Patch | | @siteimprove/alfa-map | Patch | | @siteimprove/alfa-mapper | Patch | | @siteimprove/alfa-math | Patch | | @siteimprove/alfa-media | Patch | | @siteimprove/alfa-monad | Patch | | @siteimprove/alfa-network | Patch | | @siteimprove/alfa-option | Patch | | @siteimprove/alfa-parser | Patch | | @siteimprove/alfa-performance | Patch | | @siteimprove/alfa-predicate | Patch | | @siteimprove/alfa-promise | Patch | | @siteimprove/alfa-record | Patch | | @siteimprove/alfa-rectangle | Patch | | @siteimprove/alfa-reducer | Patch | | @siteimprove/alfa-refinement | Patch | | @siteimprove/alfa-result | Patch | | @siteimprove/alfa-rules | Patch | | @siteimprove/alfa-sarif | Patch | | @siteimprove/alfa-selective | Patch | | @siteimprove/alfa-selector | Patch | | @siteimprove/alfa-sequence | Patch | | @siteimprove/alfa-set | Patch | | @siteimprove/alfa-slice | Patch | | @siteimprove/alfa-string | Patch | | @siteimprove/alfa-style | Patch | | @siteimprove/alfa-table | Patch | | @siteimprove/alfa-test | Patch | | @siteimprove/alfa-thenable | Patch | | @siteimprove/alfa-thunk | Patch | | @siteimprove/alfa-time | Patch | | @siteimprove/alfa-toolchain | Patch | | @siteimprove/alfa-trampoline | Patch | | @siteimprove/alfa-tree | Patch | | @siteimprove/alfa-trilean | Patch | | @siteimprove/alfa-tuple | Patch | | @siteimprove/alfa-url | Patch | | @siteimprove/alfa-wcag | Patch | | @siteimprove/alfa-web | Patch | | @siteimprove/alfa-xpath | Patch |

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

rcj-siteimprove commented 1 month ago
  • It looks like Name.fromDescendants could be cached, which would solve the 2F/2H duplication.

Right, I'll do that in stead 👍🏼

  • Would it make sense to properly define the monad that fromSteps is trying to implement? We'd have something like a glorified Option (to handle the "first non-empty otherwise first empty" logic) and could just do a .orElse on it. OTOH, the full point about looking back from the start makes this weird 🤔

Might be interesting to look into, but I think for another PR.

  • The caching in fromSteps might be easier to do or understand by storing the results in an array. Since collectFirst's mapper accepts the index, it's easy to store the value during the first pass and retrieve it during the second.

I'll look into this.

  • It may also make sense to extract the individual steps (2B, …) into separate functions (not sure about how much is locally shared), so that the fromSteps call in fromElement is more streamlined (and the full function smaller) (it would also allow testing of individual steps if needed).

I thought about that, and tried somewhat to do it, but it got a little messy. I got a little confused about what a None actually means when returned from the subfunctions. Does it mean that the function was "applicable" and returned a (possibly empty) name or if they were not applicable and the next step should be executed (it somewhat seems like a mix 🙈). It looks like, at least in some cases, that we would need to have the function return an Option<Option<T>> (or Option<T> | null 🙈) where the outer option would signal if the step was applicable and the inner would signal if the name was empty or not. It get's even more confusing, in my opinion, since there are also actual empty names i.e. instances of Name with an empty value (with or without spacing). In my opinion, the whole computation could use a little TLC (it seems to be a breeding ground for bugs anyway 😅).

In this PR, I think it's best to focus on fixing the inefficiency, and then we can follow up with other improvements.