perf: Optimize subscriptions for scans with a limit

grgbkr commented 3 years ago

Problem

Right now we do not use limit when determining if a scan is affected by a key.(https://github.com/rocicorp/replicache/blob/2d2340227184c058bcb49ed1070bf129eb7a0385/src/subscriptions.ts#L39).

We only use the startKey and prefix. This means that if we have something like:

rep.subscripe(tx => {
  for await (const entry of tx.scan({startKey: 'a', limit: 5}) {
     console.log(e);
  }
}, {onData() {});

we cannot tell whether a change to key 'x' should affect the subscription function.

Solution

Inside SubscriptionTransactionWrapper's scan method.

We currently only store a ScanOptions for each scan, update to store a ScanSubrscriptionInfo, consisting of ScanOptions and the new field inclusiveLimitKey. inclusiveLimitKey will be populated based on a callback driven by the scan implementation in btree/node.ts (this callback has to be threaded through a few layers)
Then in scanOptionsMatchesKey we know that a key does not match if there is a limit and the changed key is greater than the inclusiveLimitKey. This works for both prefix, startKey, and a combination of prefix and startKey.

Note this optimization is only applied when a subscription reads its scan to its limit (it choses to stop early, or it runs out of entries). We can optimize some cases where a subscription does not read its scan to its limit, however these optimization are only correct if the subscription body is a pure function on the replicache store state. Subscription bodies should be pure in this way, but so far replicache behavior is correct even if they are not pure. We chose to not implement these further optimizations, erroring on the side of correctness over performance. We can of course revisit this if it turns out not reading a scan to its limit is a common use case.

Perf measurements

This greatly improves writeSubRead benchmark performance when random invalidates are used, reducing the median time by ~70%. This benchmark uses 128 scans each using startKey and limit.

Made on my M1 Max w 64 GB of memory

Before this optimization:

[MemStore] writeSubRead randomInvalidates false 100MB total, 128 subs total, 5 subs dirty, 16kb read per sub 50/75/90/95%=2.50/4.00/11.40/11.40 ms avg=4.56 ms (7 runs sampled)
[MemStore] writeSubRead randomInvalidates true 100MB total, 128 subs total, 5 subs dirty, 16kb read per sub 50/75/90/95%=9.50/12.70/19.70/19.70 ms avg=12.69 ms (7 runs sampled)

With this optimization:

Running 2 benchmarks on Chromium...
[MemStore] writeSubRead randomInvalidates false 100MB total, 128 subs total, 5 subs dirty, 16kb read per sub 50/75/90/95%=2.50/4.20/10.70/10.70 ms avg=4.49 ms (7 runs sampled)
[MemStore] writeSubRead randomInvalidates true 100MB total, 128 subs total, 5 subs dirty, 16kb read per sub 50/75/90/95%=3.00/4.40/11.10/11.10 ms avg=4.99 ms (7 runs sampled)
Done!

Closes #666

vercel[bot] commented 3 years ago

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/rocicorp/replicache/ryH5cyL4m9oPTK3JbzAVkWBxpMP2
✅ Preview: https://replicache-git-grgbkr-666-end-exclusive-key-rocicorp.vercel.app

grgbkr commented 3 years ago

@arv this is ready for review now. Let me know if you can think of any other tests to write.

I also don't know too much about subscriptions/scan and indexes . Are there any considerations around indexes we are missing?

rocicorp / replicache