dolthub / dolt

Dolt – Git for Data
Apache License 2.0
17.84k stars 508 forks source link

Implement "Advanceable" RowIter #6661

Open nicktobey opened 1 year ago

nicktobey commented 1 year ago

(I'm not calling this "fast-forwardable" because apparently that's already a term in Microsoft SQL Server that has a different meaning.)

Currently, every join plan iterates over every row in the primary provider. This becomes a performance bottleneck for joins where only a small number of rows in the primary actually match. Imagine a join where the join condition is extremely selective, and only a small fraction of the cartesian product actually ends up in the result. We currently have no join strategy where the cost of the join scales with the cardinality of the result, because it has to scale with the cardinality of at least one of its inputs.

One potential solution would be to add an optional interface to RowIters that exposes a Advance or FastForward method, which takes a key and advances the cursor up to that key. Thus, the iter behaves like a range lookup with multiple ranges, except that those ranges can be determined dynamically during the join.

The cursor's in Dolt's storage layer already have the ability to efficiently "advance". It's what makes efficient diffs possible. If we expose this functionality to the SQL layer, then IndexedTableLookups can implement this new interface, and this can allow us to get similar benefits to merge joins.

Currently, merge joins are implemented with a loop: each iteration of the loop advances either the left or the right RowIter, based on the relative values of their keys. But we can only advance an iter a single step in each loop, and then we have to compare the keys again. This is what limits the performance to the cardinality of the input iters.

Alternatively, imagine we're going a merge join between two tables on their primary keys. (Table A has primary key a, and Table B has primary key b.) Now our merge join implementation looks like this:

Limitations

We get the most benefit from this when joins are highly selective. However, without better statistics, we don't really have the ability to reason about when a join is highly selective. So we currently have a hard time detecting when this makes a difference and would have a hard time incorporating it into the coster.

That said, there's no penalty for fast forwarding an iter: even in the worst case it's no slower than doing a comparison on every row, which is what we currently do for merge joins. So for cases where cost analysis already selected a merge join, this gives us additional benefit for basically free.

Implementation Strategies

Option A: Like described above, we create a AdvanceableRowIter interface that extends RowIter and provides the methods Advance(key []interface{}) and SupportsAdvance() bool methods. SupportsAdvance allows us to have RowIters that conditionally support advancing based on the type of their child RowIters.

Option B: Instead of creating a new interface, we add a nilable key parameter to RowIter::Next(). This parameter serves as a hint to the RowIter that the parent doesn't need any rows between the current row and the next row with the supplied key. This is not a guarentee that the next row will have that key: RowIter implementations can always safely ignore this parameter, and RowIter users cannot assume that the next row will have the desired key. RowIters that aren't ordered simply ignore the parameter.

I'm personally leaning toward B. It technically refactors every RowIter implementation, but it's light and unobtrusive.

Considered Alternatives

We could avoid changing RowIters at all by changing the behavior of merge joins to create new RangeLookups on the child nodes whenever the key changes. But this would complicate the merge join logic even further, be error prone (for instance, how do we know when we've reached the end of the child tables?), and potentially more expensive. The previous options let us separate the logic more cleanly.

This option would also limit the benefit to merge joins, while the previous options could potentially have other benefits.

Further Work

Implementing this is a prerequisite if we want to support Zig Zag Joins.

Priority

This isn't a priority until we encounter a query from a client or benchmark that would benefit from it. I just wanted this to be documented.

max-hoffman commented 1 year ago

A simple implementation could be fairly liberal disposing/recreating range scans, also. You could imagine "advance" tossing the current range scan for a merge join, and recreating it with the new key.

The storage layer has an interface Cursor.Seek, which advances along leaf chunks for the first primary key gt/eq a comparison value. We currently use this for point lookup. A prototype could overload sql.IndexLookup to support the specific behavior you are looking for; something in-between a range scan and a strict point lookup. LookupBuilder might also need to be edited to support that prototype.

This is also really useful single range scans on tables, in addition to joins. We can hop between different indexes on the same table for a collection of selective filters that apply to disjoint indexes.