cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
30.22k stars 3.82k forks source link

sql: force column types in view plans (make view descriptors future-proof) #12611

Open knz opened 7 years ago

knz commented 7 years ago

The problem is as follows; suppose at some time we define a view:

create view v as select somefunc(x+3, y) from ...

then at some later point either of the following happens:

Then any uses of the view will discover a new type for the columns, effectively changing the semantics of the view. In some cases it could even break (e.g. if the overload resolution for somefunc has become ambiguous)

This is because the code that re-creates the query plan from the view descriptor infers the types using the text of the original view query using the new typing rules.

There are two parts for a full solution to this, but the first will go a long way to make views future-proof:

  1. give the column types stored in the view descriptor as "demanded types" when instantiating the view plan. This will bias type checking strongly in favor of preserving the previous types and catch most problematic cases.
  2. annotate the syntax tree preserved in the view descriptor with the types of every expression node and the specific builtin overload used for each function call, then skip type checking entirely when instantiating the query plan from the view descriptor.

Jira issue: CRDB-6128

nvanbenschoten commented 7 years ago

I was actually just thinking about this, and I think your two proposed changes sound solid (although I don't think the first one is necessary if we do the second one). They would address your first and third potential issue. However, I don't think they would address the second, which is more tricky and may require function versioning.

IIRC correctly, most SQL databases like Postgres will compile persisted statements to an internal representation before storing. This allows them to avoid these kinds of issues. If we're going to be re-parsing and typing our serialized format, fully type-annotating the expression tree seems like the right approach.

Also, I think we should be doing this in cases like CHECK constraints as well. Anywhere that statements need to be persisted will face these complications.

knz commented 7 years ago

Yeah perhaps I should mention a few of these ideas in the IR RFC too.

knz commented 7 years ago

This needs #15388 which I bumped to 1.1, so bumping this one to 1.1 as well.

This needs to be documented as a known limitation though.

cuongdo commented 7 years ago

@knz how is this looking for 1.1?

knz commented 7 years ago

It's queued after #17501 merges. I think once #17501 is in I can cook a small fix.

jordanlewis commented 7 years ago

Not making it into 1.1.0. Under consideration for inclusion in a subsequent minor release along with several other view bugfix PRs.

jordanlewis commented 4 years ago

@rohany this one too. @RaduBerinde, just noticed you triaged this in the Optimizer board - I don't think you were planning to deal with it soon though, right?

RaduBerinde commented 4 years ago

No, I had just triaged all opt issues.

github-actions[bot] commented 1 year ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!