cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
29.89k stars 3.78k forks source link

sql: provide syntax to scan an index directly #59549

Open ajwerner opened 3 years ago

ajwerner commented 3 years ago

Is your feature request related to a problem? Please describe.

It'd be valuable for testing and debugging to have a primitive exposed to SQL to scan an index and return some representation of the KVs it contains. Today we have syntax to try to force a scan on an index but it doesn't provide low-level access to the KVs in the index.

Describe the solution you'd like

I'm not exactly sure what sort of syntax would be best. We could have some new syntax in the grammar or we could have some sort of magical function.

Jira issue: CRDB-3275

RaduBerinde commented 3 years ago

Perhaps it could be a variant of SHOW RANGES. One question is whether we would allow constraining the index to a specific range or if it's ok to always show the entire index (it may not be important for writing tests, but it may be important when debugging a specific issue on a real deployment).

ajwerner commented 3 years ago

On some level, it feels like what we want is a primitive to read raw KVs from a key span. Then, perhaps, we add some functions to make decoding keys into a usable form from SQL. The set of builtins we might add might get large and hard to use.

ajwerner commented 3 years ago

Another relevant question is whether we'd ever want to read a set of KVs that spans more than one index rather just constraining it further.

fqazi commented 3 years ago

Start off implementing this as a builtin then add support for inside the optimizer to build something better afterwards.

ajwerner commented 3 years ago

Additional context (internal)

ajwerner commented 3 years ago

The big questions I have here is whether we'd want to provide

1) Access to the datums stored in the KVs. 2) Mechanisms to provide predicates to constraint spans or prune values.

I'm assuming, for now, that the answer to both is no.

@RaduBerinde, at your leisure, can you see if this all lines up with your expectations?

RaduBerinde commented 3 years ago

I agree the tuple generating builtin is the most reasonable direction. I think the apparent limitations can be addressed:

  1. The datums in the KV could be part of the resulting tuple. The tuple could contain the datums in the key as an "inner" tuple, and the datums in the value as another inner tuple. E.g. (raw_key, raw_value, (1, 'foo'), ('bar', 10, 'baz').
  2. We could have another built-in for encoding a key according to a table and index, and the tuple generating built-in could optionally take start and end keys. For example crdb_internal.encode_key(tab, idx1, 1, 'foo'). Perhaps the built-in always functions like this, needing keys generated with other primitives instead of taking a table/index.

Ultimately though, we should have a list of specific usecases for the feature to make sure they are covered by the solution.

ajwerner commented 2 years ago

I think about this issue from time to time when I think about how hard it can be to introspect our key encoding. I think this would be a wonderful tool when combined with some decoding builtins to write all manner of low-level introspection and validation tooling.

ajwerner commented 2 years ago

I'm putting this on the SQL Queries board because I think it's a great issue. No urgency or anything.

ajwerner commented 2 years ago

This would also provide a path to solving https://github.com/cockroachdb/cockroach/issues/54481.

github-actions[bot] commented 1 year ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!