Open RaduBerinde opened 4 years ago
You really need an option to stop the scan at a range boundary once there's been a result, right? If the first range it hits is empty you want to keep going.
Or were you thinking you'd have the fetcher do the pagination directly?
Maybe what we should do is abstract over all of the various ways you may want to consume results and let the author of the batch decide. For example, you could pass in a closure that gets to view the new result every time you get one (i.e. roughly once per range, absent limits), and decides whether that's enough.
You really need an option to stop the scan at a range boundary once there's been a result, right?
Correct.
@drewdeally ran into this again. It would be great if we could get something done in this release. Could find an assignee for the issue?
Hm, to me it seems like the work is mostly in the KV layer with SQL Execution being responsible for reusing the new API. cc @asubiotto @jordanlewis
FWIW, we used to have an API for scanning until the end of a range, but we deleted it cause it wasn't used. But that's at the level of a request to a single range. It seems to me that what we want here, more importantly, is control over how the DistSender fans out requests with limits - should it speculatively fan them out with parallel requests or should it read range by range in some order.
I'm working on some related changes for #70564 and #68050, will see if I can address this while I'm at it. From what's said above, it sounds like #70763 might fit the bill. I'm curious though:
In general, we need to fetch one more key so that we're sure we got all keys for a row (if there are multiple column families). But we could get the same signal if we knew that we hit the end of a range.
Could you elaborate a bit on this? How would the range boundary help? Presumably we're using a prefix-style scan to get all keys for a row, and we either need to get all of them or none of them? And if we're not using a prefix-style scan, what's stopping the keys of interest from straddling the range boundary?
There is an unfortunate invariant that SQL forces on us which is that SQL rows don't straddle range boundaries.
(sent from mobile, please excuse my brevity)
On Tue, Sep 28, 2021, 11:29 Erik Grinaker @.***> wrote:
I'm working on some related changes for #70564 https://github.com/cockroachdb/cockroach/issues/70564 and #68050 https://github.com/cockroachdb/cockroach/issues/68050, will see if I can address this while I'm at it. From what's said above, it sounds like
70763 https://github.com/cockroachdb/cockroach/pull/70763 might fit
the bill. I'm curious though:
In general, we need to fetch one more key so that we're sure we got all keys for a row (if there are multiple column families). But we could get the same signal if we knew that we hit the end of a range.
Could you elaborate a bit on this? How would the range boundary help? Presumably we're using a prefix-style scan to get all keys for a row, and we either need to get all of them or none of them? And if we're not using a prefix-style scan, what's stopping the keys of interest from straddling the range boundary?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cockroachdb/cockroach/issues/47044#issuecomment-929018841, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABGXPZC6UFIO4H5I5O2AFVTUEGDF5ANCNFSM4L7LH57Q .
A KV API to return partial results on range boundaries has been implemented in #70763. Leaving this issue open to update the SQL bits.
@yuzefovich any thoughts on if this is relevant/impactful?
This is still present, and it is the most impactful in multi-region deployments when the table has multiple column families (which is probably an edge case, so we can wait until a customer explicitly complains about it).
Here is a quick example to show the impact:
CREATE TABLE t (k INT PRIMARY KEY, v1 INT, v2 INT, FAMILY (k, v1), FAMILY (v2));
ALTER TABLE t SPLIT AT VALUES (2);
INSERT INTO t VALUES (1, 1, NULL), (2, 2, NULL);
EXPLAIN ANALYZE SELECT * FROM t LIMIT 1;
-- KV pairs read: 2
To satisfy this query we had to fetch both keys present in the table, even though the second key is on a separate range and cannot be part of the row contained within the first range.
A user saw a case where selecting all rows from a (small) partitioned table was significantly faster than selecting one row using LIMIT 1.
The reason was that the first region was also the farthest away, and it only had one row. There is a single TableReader planned in that region (because of the limit); the kv fetcher requests two keys in this case, and the scan ends up going to a range on another region.
In general, we need to fetch one more key so that we're sure we got all keys for a row (if there are multiple column families). But we could get the same signal if we knew that we hit the end of a range.
~Having a KV API that allows stopping a scan at the end of the range would be useful here. CC @tbg @andreimatei who have been thinking about the APIs between KV and SQL.~
EDIT (@erikgrinaker): The KV API is available as of #70763.
Jira issue: CRDB-5048