Double hot path perf, halve payload size, maintain readability

chenglou commented 2 years ago

Follow-up to https://twitter.com/_chenglou/status/1570895293784391681 and https://twitter.com/_chenglou/status/1567269585585606659 which turned into some nice discussions.

PR not meant to be merged. Feel free to take various bits back into the codebase.

The goal is to improve tldraw and perfect-freehand’s UX and DX through perf improvements. Generally, perf can be obtained either by complicating things, or by simplifying things. Imo we don’t do enough of the latter, so I’d like to show some examples here.

getSvgPathFromStroke turns a list of coordinates into a path string for SVG. It’s the hottest code in perfect-freehand, which is used in tldraw, and one of the hottest there as well, as Steve profiled.

First commit

This is a classic graphics programming array-of-structure to structure-of-array transform where we turn the points data structure Array<[number, number]> into {xs: Array<number>, ys: Array<number>}. You don’t need this. I believe we’re settling with Array<{x: number, y: number}>, which is better readability and good enough perf increase compared to the current format. For more info, see the previous 2 links.

(There’s a 0th commit which turns string appends into string interpolation, also elaborated in the first tweet. Together they improve this hot path by 8-10%).

Second commit

This turns toFixed(3) into toFixed(2) for another 10% overall perf boost (!). toFixed is disproportionally the hottest call here, even if it’s already the most performant way to turn numbers into strings (again, benchmark to see whether it’s true for your own codebase. It’s likely not). A third decimal point's precision isn’t needed. I superimposed complex curves on 1. macOS with 2. Studio Display on 3. Safari with 4. pinch zoom, the extreme quadfecta for stress testing crisp curves, and I've only spotted minor anti-aliasing differences. Code-wise, no readability decrease.

Third commit

toFixed is so hot that there’s basically no point spending effort optimizing anything else. So the third commit aims at it. We’re using the Q command of <path>’s d attribute to create quadratic Bézier curves passing by average(point_i, point_(i+1)). Each Q command takes in 4 numbers that require toFixed stringification. I was thinking that, if there’s a way to use fewer numbers, then we’d have solved this bottleneck. Fortunately there’s exactly T, which implicitly assumes the control point is a reflection of the previous one (good illustrations). But, an average of two points exactly establishes such reflection 🤩. Now instead of using 4 Q numbers, we use 2 T numbers. This also improves readability. Scanned Documents

Halving the numbers halved the calls to toFixed, which gives 200% perf boost along with the earlier commits. These also halved the payload size for the path string. And since perfect-freehand and tldraw’s payload sizes (?) are dominated by Bézier curve paths, this theoretically halves all downstream sharing of their output. Note: real gains will be smaller because, real world.

Lessons

Profile first. One’s chance of guessing exactly the right line of code causing problems in a big codebase nears 0%. Conversely, any kind of architecture, dependency and caching added without profiling are likely based on the wrong assumptions. Thankfully, Steve did here.
Generic optimizations are nice when they improve readability too.
But nothing beats knowing the domain and applying specific optimizations for performance and readability. This has been true for https://twitter.com/_chenglou/status/1570895293784391681 as well, where the domain-specific knowledge of sin and cos in getOutline blew all the other generic optimizations out of the water. The stronger your codebases’ assumptions, the better. As an aside, imo folks' tendency of using generic abstract architectures, either by naively "future-proofing" code, or excessively dragging dependencies (by definition generic), removes critical opportunities for perf and readability improvement, causes extra bugs, which in turns make true abstractions harder to find.

Btw, these changes look simple (that’s the goal) but could have gone out of control if we weren’t careful. None of this acrobatics would have been necessary if DOM was fast and/or took a data structure of points instead of asking us to serialize them into a string, only to be deserialize again anyway.

vercel[bot] commented 2 years ago

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Updated
perfect-freehand-example	❌ Failed (Inspect)		Sep 18, 2022 at 9:13AM (UTC)
pf-dev	❌ Failed (Inspect)		Sep 18, 2022 at 9:13AM (UTC)

steveruizok commented 2 years ago

So my first thought is wow! This has the potential to really improve the performance of apps that use perfect freehand. The library’s methods have always been very fast, however turning the data it returns (points that describe the contour of the “stroke” polygon) into SVG paths has been a bottleneck on longer lines. In other words, while creating an SVG path from the data is significantly slower than generating the data itself, it’s necessary to actually use the data in SVGs, so improving that speed would improve the “real speed” of using the library.

The optimized method makes some assumptions about the format of the data coming out of the library. To really make the most of its improvements, we would also want the library to return an object with the same shape as the new conversion method’s arguments (ie { xs: number[], ys: number[] }). As it is now, the library returns number[][].

I’m all for a version a version that returns data in this format, so long as we have corresponding “rendering examples” like the one in this PR. (My guess is that most folks are copying and pasting our implementations).

chenglou commented 2 years ago

These 3 changes are independent of each other, with the last path string change being the biggest improvement; it also doesn't break APIs. I suggest applying the last 2 changes onto the existing data format instead.

The first, the SoA change, does break APIs, and I wouldn't do it unless it's also good for ergonomics. That'd require looking into third-party usages of perfect-freehand and see whether folks' data pre-massaging are closer to number[][], {x: number, y: number}[] or {xs: number[], ys: number[]}.

steveruizok commented 2 years ago

Ah I see, I’d just looked at version in files changed. Perhaps we should also add benchmarks to the library?

chenglou commented 2 years ago

It's up to you! My opinion is that some manual profiling like you've done is good enough, then there are bigger fish to fry. You can just keep aiming for changes that increase readability while also casually checking that they're not regressing perf. Since these have proven to be good enough, you can disregard all other optimizations and benchmarking efforts, which usually take more effort than they're worth.

But again, I'm just passing by really, so you do whatever! 😃

TodePond commented 2 years ago

Hey am I doing something wrong? Any help is greatly appreciated :) I've tried to translate the 'magic' code to the number[][] format. The following function isn't intended to be performant - it's just intended to check that I've got it working:

function getSvgPathFromStroke(points: number[][]): string {

    const xs = points.map((point) => point[0])
    const ys = points.map((point) => point[1])

    const len = xs.length

    // TODO: support smaller lengths - but just ignore for now while we measure
    if (len < 3) {
      return ''
    }

    let x0 = xs[0], y0 = ys[0], x1 = xs[1], y1 = ys[1], x2 = xs[2], y2 = ys[2]
    let result = `M${x0.toFixed(2)},${y0.toFixed(2)}
      Q ${x1.toFixed(2)},${y1.toFixed(2)} ${average(x1, x2).toFixed(2)},${average(y1, y2).toFixed(2)}
      T `;

    for (let i = 0, max = len - 1; i < max; i++) { // TODO: bound check, start at > 0, etc.
      result += `${average(xs[i], xs[i+1]).toFixed(2)},${average(ys[i], ys[i+1]).toFixed(2)} `;
    }

    result += 'Z'

    return result
  }

But it ends up making wobbly lines:

I assume I've missed something or made some mistake - any ideas? No worries if not, I'll take a look at it with fresh eyes later :)

steveruizok commented 2 years ago

I think this is it:

function getSvgPathFromStroke(points, closed) {
  const len = points.length

  if (len < 4) {
    return ``
  }

  let a = points[0]
  let b = points[1]
  const c = points[2]

  let result = `M${a[0].toFixed(2)},${a[1].toFixed(2)} Q${b[0].toFixed(2)},${b[1].toFixed(
    2
  )} ${average(b[0], c[0]).toFixed(2)},${average(b[1], c[1]).toFixed(2)} T`

  for (let i = 2, max = len - 1; i < max; i++) {
    a = points[i]
    b = points[i + 1]
    result += `${average(a[0], b[0]).toFixed(2)},${average(a[1], b[1]).toFixed(2)} `
  }

  if (closed) {
    result += 'Z'
  }

  return result
}

steveruizok commented 2 years ago

Benched here against the previously fastest implementation: https://jsbench.me/0rl88xs7a8/1

TodePond commented 2 years ago

Thanks! I'll give it a go :)

TodePond commented 2 years ago

Nice it worked! the magic approach seems fast in my benchmark :)

steveruizok / perfect-freehand