cockroachdb / cockroach

CockroachDB - the open source, cloud-native distributed SQL database.
https://www.cockroachlabs.com
Other
29.22k stars 3.65k forks source link

sql: facilities to check to rewrite computed column expressions #68202

Open RaduBerinde opened 2 years ago

RaduBerinde commented 2 years ago

We are seeing multiple cases where computed column expressions need to be fixed before an upgrade to a newer version is possible. This is due to various backward-incompatible SQL changes, like removing deprecated builtins or correcting the immutability property of operators.

One big problem is that you won't know that something is wrong until after the upgrade. Then the table can stop working altogether, or the optimizer no longer generates the necessary plans.

We need:

CC @vy-ton

Jira issue: CRDB-8929

vy-ton commented 2 years ago

fyi @otan

@ajwerner Can we treat this as a generalized scenario of the interleaved feature removal?

Version Behavior
N-2 Feature is deprecated and new usage is still enabled by default
N-1 Feature is deprecated and new usage is disabled by default
N Feature is removed and cluster cannot be activated until all feature usage has been removed

Notes

RaduBerinde commented 2 years ago

I think the rewrite facility would require manual input. In some cases we can fix up an expression with an equivalent one, but not always.

Note that in many cases, the backward-incompatible change fixes a bug and we don't want to have an entire major version (N-1) with the bug still present.

ajwerner commented 2 years ago

@ajwerner Can we treat this as a generalized scenario of the interleaved feature removal?

In both this case and that case I wonder if we're missing some machinery that we'd like to have. Namely, the plan of record is that we're going to run the checks for the interleaved while in the mixed version state, mid migration. If there's a problem, the user will have to do something about it before being able to finalize.

The issue with this "check during migration" approach alone is that if the user decides that doing something about it is hard and that maybe that don't want to upgrade, then they're not in a very good spot. They cannot roll back to the old version. We've said this is fine in cases with phases to migrations and have given the user notice. As Radu is pointing out, this won't always be the case.

The additional machinery I'd like to see is, for migrations which are really about gating functionality based on some property of the state of the system we can query, would be to run all of these checks before even beginning to think about moving version gates (and thus locking out the old version).

I don't think that such hooks are very hard to build, but it is another thing to do.

ajwerner commented 2 years ago

I typed up the idea here: https://github.com/cockroachdb/cockroach/pull/68242

@RaduBerinde would this interest you?

RaduBerinde commented 2 years ago

Yeah, that kind of hook makes sense.

ajwerner commented 2 years ago

SQL schema is going to deliver the pre-condition thing here. I think it's on either @cockroachdb/sql-experience or @cockroachdb/sql-queries to write those checks when changing volatilities of functions.

vy-ton commented 2 years ago

@otan @rafiss Once Schema introduces the pre-condition checks, should we consider doing so for our discussion here about the parsing a date from a string would incorrectly assuming YMD format instead of the MDY format if the date was formatted using the two digit format: "YY-MM-DD" instead of "MM-DD-YY"

The general sentiment is that we want to be more rigorous with backwards-incompatible changes and prevent users from getting into a bad situation.

Andrew did note the difference between detecting feature usage between schema vs queries. We can more easily detect use in schema over queries.

ajwerner commented 2 years ago

The pre-condition stuff has been merged so sending this back over to @cockroachdb/sql-experience and @cockroachdb/sql-queries.

otan commented 2 years ago

Once Schema introduces the pre-condition checks, should we consider doing so for our discussion here about the parsing a date from a string would incorrectly assuming YMD format instead of the MDY format if the date was formatted using the two digit format: "YY-MM-DD" instead of "MM-DD-YY"

unfortunately this is a parse time thing, i'm not sure we can do much at the schema level to check for this (would need to look at the data).

vy-ton commented 2 years ago

checks when changing volatilities of functions

@otan If we enable DateStyle by default in a post 21.2 release, we would be changing a function's volatility, right? Then there would be certain functions that would be disallowed in computed columns?

RaduBerinde commented 2 years ago

@otan If we enable DateStyle by default in a post 21.2 release, we would be changing a function's volatility, right? Then there would be certain functions that would be disallowed in computed columns?

Correct.