ccorcos / tuple-database

406 stars 19 forks source link

question: reactivity #28

Closed theSherwood closed 12 months ago

theSherwood commented 12 months ago

I'm currently working on an application that makes use of tuple-db's reactivity and I've run into a mismatch with what I need vs what tuple-db provides.

So tuple-db provides reactive queries. When there is an insertion, edit, or deletion in the db that would affect the result of a reactive query, that query is rerun, reproducing the entire result set. Conceptually, it seems that there are a few things at work here:

  1. A query is registered with a callback/watchers
  2. A change is detected that intersects with a query
  3. The query runs, producing a result
  4. Watchers are called with the new result

Is it possible to decouple those things? Something more like this:

  1. A range (or set or ranges) is registered with a callback/watchers
  2. A change is detected that is intersects with a range
  3. Watchers are called with the change(s)

If such a thing were possible, then the existing behavior of simple reactive queries could be a simple convenience wrapper on top.

For my purposes rerunning the query is an overhead I don't want to pay and it's not the information I'm interested in. What I'm generally interested in is the edit/insertion/deletion that invalidates some result set. Is what I'm describing possible within tuple-db's architecture and would you be open to such a PR?

ccorcos commented 12 months ago

Yeah, so this is exactly how tuple-database currently works: https://github.com/ccorcos/tuple-database/blob/aa18612e57fe4946df99866e7f3e7bf40cbe87b8/src/database/async/AsyncReactivityTracker.ts

Ideally, we'd use an interval tree here for this algorithm. But right now we're using a strategy called space partitioning that is tied directly to the tuple structure.

It works like this:

Given a query that looks like this: {gt: [a, b, c], lt: [a, b, d]}, we figure out the common prefix tuple which is [a, b] and register this query to that prefix.

Then when we write to the database, something like [a, b, x, y], then we search for listeners on all prefixes of that tuple: [a], [a, b], [a, b, x], and [a, b, x, y]. Then, we double-check the ranges before we emit. In this case, we'll find the listener on the [a, b] prefix, but we won't end up emitting on it because [a, b, x] is not in the specified range {gt: [a, b, c], lt: [a, b, d]}.

In the most general case where the tuples are all length-1, this algorithm has poor performance. But in practice, developers end up nesting tuples in a way that partitions data to make this algorithm fairly performant.

As I said, an interval tree is the proper solution here though! We'll do that one day 😄

theSherwood commented 12 months ago

Yeah, so this is exactly how tuple-database currently works: https://github.com/ccorcos/tuple-database/blob/aa18612e57fe4946df99866e7f3e7bf40cbe87b8/src/database/async/AsyncReactivityTracker.ts

Awesome! I've been using the subscribe interface but seems like I should just use the AsyncReactivityTracker and/or corresponding synchronous ReactivityTracker directly, then. I'll take a swing at that.

As I said, an interval tree is the proper solution here though! We'll do that one day 😄

I'm guessing you mean "rtree" here, right?

ccorcos commented 12 months ago

I'm guessing you mean "rtree" here, right?

An interval tree is all I would need since these are 1-dimensional ranges. A 1-dimensional r-tree is an interval tree (I think).